Oracle® Enterprise Manager Cloud Control Administrator's Guide
12c Release 1 (12.1.0.1)

Part Number E24473-01

2 Incident Management

Incident and problem management allows you to monitor and resolve service disruptions quickly and efficiently.

This chapter covers the following topics:

Overview: Monitoring and Managing Via Incidents

Enterprise Manager Cloud Control 12c greatly expands target monitoring and management capability beyond previous releases by letting you focus on what is important from a broader monitoring/management perspective rather than focusing on discrete events that may or may not be relevant to a particular situation.

Note:

A mobile application is also available for managing incidents and problems on the go. For more information, see Chapter 9, "Cloud Control Mobile".

What is an event?

An event is a discrete occurrence detected by Enterprise Manager related to one or more managed entities at a particular point in time which may indicate normal or problematic behavior. Examples of events include: database target going down, performance threshold violation, change in application configuration files, successful completion of job execution, or job failure.

Previous versions of Enterprise Manager generated alerts for exception conditions (metric alerts). For Enterprise Manager 12c, metric alerts are a type of event, one of many different event types. This event model significantly raises the number of conditions in an IT infrastructure for which Enterprise Manager can detect and raise events.

What is an incident?

An incident is the situation or issue you need to act on. By definition, an incident is an event or a set of closely correlated events that represent an observed issue requiring resolution through (manual or automated) immediate action or root-cause problem resolution.

When you create an incident, you define a macroscopic set of conditions that you want to monitor and/or manage. In general, you should not need to modify an incident once it is defined, but only the events that make up the incident. An incident may consist of a single event, as might be the case when you are only interested in whether a single database is up or down, or of multiple events, as would be the case when monitoring host resources where you are interested in a variety of metric alert conditions such as disk space utilization and CPU load.

Managing your environment via incidents is carried out through Incident Manager, Enterprise Manager's new console, which provides you with a centralized location from which to view, manage, diagnose, and resolve incidents as well as identify, resolve, and eliminate the root cause of disruptions. See "Incident Manager Console" for more information.

Events

By definition, an event is a significant occurrence within your IT infrastructure that Enterprise Manager can detect and subsequently notify interested parties or take action on. An event has very specific attributes that allow Enterprise Manager (and ultimately an Enterprise Manager administrator) to identify, categorize, and manage the event. All events have the following attributes

Event Types

Previous versions of Enterprise Manager let you monitor and manage by discrete signals and notified you by raising a metric alert as a result of threshold violations. For Enterprise Manager 12c, a metric alert is now just one of several categories of event conditions that Enterprise Manager can monitor. These event conditions are called Event Types. As shown in the following list, the range of event types greatly expands Enterprise Manager's monitoring flexibility:

Incidents allow you to manage many discrete event types by providing an intuitive way to combine them into meaningful issues that you can act upon.

Event Severity

Another event attribute is severity. Just as previous versions of Enterprise Manager utilized metric alert severity levels, this concept has been extended to all event types. Events can have the following severity levels:

Incidents

As previously mentioned, you monitor and manage your Enterprise Manager environment via incidents and not discrete events (even though an incident can conceivably consist of a single event). Managing by incident means that, rather than managing discrete events for your system (for example, a single target metric warning threshold being exceeded), you now manage an incident that may consist of one significant event (for example, a target down event) or a combination of related events (for example, multiple metric thresholds being exceeded). Incidents add an intuitive layer of abstraction that allows you to manage your monitored systems more efficiently. Once you define an incident, it becomes Enterprise Manager's job to monitor for the specified events of interest. This allows you to quickly identify, resolve, and eliminate root causes of monitored system disruptions.

When an incident is created, Enterprise Manager makes available a variety of functions that cover the incident management workflow, allowing you to manage and track the incident through its complete lifecycle. Incident management functions allow you to:

All incident management/tracking operations are carried out from the Incident Manager console. Incident management can be automated using incident rules.

The following examples illustrate how incidents are constructed and how attributes map to various stages of the incident lifecycle.

Incident Composed of a Single Event

The simplest incident is composed of a single event. Since the incident serves as a wrapper for related events, in this case the incident and the event can be considered one and the same.

In the following example, you are concerned about the availability of a single, very important database. Hence, you are only interested in a single event, which in this case is a database availability event. You create an incident whereby if the database encounters problems, Enterprise Manager will raise an event and open the incident. Once open, you will have available all incident management functionality required to track and manage its resolution.

Figure 2-1 Incident with a Single Event

graphic illustrates an incident with a single event.

The figure shows how both the incident and event attributes are used to help you manage the incident. From the figure, we see that the database DB1 has gone down and an event of Fatal severity has been raised. An incident is opened and the owner/administrator Scott is currently working to resolve the issue. The incident severity is currently Fatal because the incident inherits the worst severity of all the events within the incident. In this case there is only one event associated with the incident, so the severity is Fatal.

Once the incident is open, you can use the Incident Manager console to track its ownership and resolution status, set its priority and, if necessary, add annotations to share information with others when working in a collaborative environment. In addition, you have direct access to pertinent information from My Oracle Support (MOS) and links to other areas of Enterprise Manager that will help you resolve the database problem.

Incident Composed of Multiple Events

Situations of interest may involve more than a single event. It is the incident's ability to monitor for multiple events that demonstrates the power and flexibility of monitoring and managing via incidents rather than discrete events.

Figure 2-2 Incident with Multiple Events

graphic shows an incident with multiple events

As shown in the figure, the incident is newly opened and has not yet been acknowledged. The incident severity is Critical even though one of the events (Memory Utilization) is only at a Warning severity level. Incidents inherit the worst severity of all the events within the incident. The incident summary indicates why this incident should be of interest, in this case, "Machine Load is high". This message is an intuitive indicator for all administrators looking at this incident. By default, the incident summary is pulled from the message of the event; however, this message can be changed by any administrator working on the incident.
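The severity roll-up described above can be sketched in a few lines of Python. This is only an illustration, not Enterprise Manager code: the `Severity` enumeration, its numeric ordering, and the `incident_severity` function are hypothetical names introduced here, and the sketch covers only the four severity levels mentioned in this chapter.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Assumed ordering for this sketch: a higher value means a worse severity.
    INFORMATIONAL = 1
    WARNING = 2
    CRITICAL = 3
    FATAL = 4


def incident_severity(event_severities):
    """Return the worst severity among the events that make up an incident."""
    return max(event_severities)


# A host-overload incident with one Critical event and one Warning event.
machine_load = [Severity.CRITICAL, Severity.WARNING]
print(incident_severity(machine_load).name)  # prints CRITICAL
```

With a single-event incident, such as the Fatal database-down event in Figure 2-1, the same function simply returns that event's severity.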

Because administrators are interested in overall machine load, administrator Sam has created an incident for these two metric events because they are related--together they represent a host overload situation. An administrator needs to take action because memory is filling up and consumed CPU resource is too high. In its current state, this condition will impact any applications running on the host.

Helpdesk Incident Resolution

If your IT group relies on a helpdesk to resolve this host overload issue, you will want the incident to file a helpdesk ticket so that your helpdesk team can manage the incident. Here, you can use the ticketing connector to integrate the incident with a helpdesk ticket. It will automatically open a ticket when the incident is created, in addition to tracking the ticket ID and the status of the ticket. This provides administrators with a way, from within Enterprise Manager, to view some ticket attributes without having to access a third-party helpdesk console. Enterprise Manager also allows you to link out to a Web-based third-party console so that you can launch the console in context directly from the ticket.

Incident Rule Sets

With previous versions of Enterprise Manager, you used notification rules to choose the individual targets and conditions for which you want to perform actions and/or receive notifications (send e-mail, page, open a trouble ticket) from Enterprise Manager. For Enterprise Manager 12c, the concept and function of notification rules has been replaced with a two-tier system consisting of Incident Rules and Incident Rule Sets.

An incident rule set is a set of one or more rules that apply to a common set of objects such as targets (hosts, databases, groups), jobs, or Web applications. The objects to which the rule set applies do not have to be of the same type. A rule set allows you to logically combine different rules relating to the common set of objects (such as jobs, targets, applications) into a single manageable unit. Operationally, rules within a rule set are executed in a specified order, as are the rule sets themselves. The following figure shows a typical rule set structure and how the individual rules are applied to a heterogeneous group of targets.

Figure 2-3 Incident Rule Set Application

Graphic shows the applications of an incident rule set.

Out-of-Box Rule Sets

Enterprise Manager provides out-of-box rule sets for incident creation and event deletion based on typical scenarios. The following rule sets are immediately available upon installation.

Incident Management Rule Sets for All Targets

  • Incident creation Rule for target down.

  • Incident creation Rule for target unreachable (for Agents and hosts).

  • Incident creation Rule for metric alerts (for critical severity only).

  • Out-of-box Incident creation rule for Service Level Agreement Alerts.

  • Incident creation rule for compliance score violation

  • Incident creation rule for high-availability events.

  • Auto-clear Rule for metric alerts older than 7 days.

  • Auto-clear Rule for job status change terminal status events older than 7 days.

  • Clear Application Dependency and Performance (ADP) alerts without incidents after 7 days.

Event Management Rule Set for Self-Update

  • Notification Rule for new updates

Note:

Out-of-Box rule sets cannot be deleted. They can only be disabled or updated.

Some examples of the types of actions that a rule can perform are:

  • Create an incident based on an event

  • Perform notification actions such as generating a helpdesk ticket

  • Perform actions to manage incident workflow and send notifications via e-mail, PL/SQL methods, or SNMP traps. For example, if a problem occurs on a database, send e-mail to administrator Joe. If the incident remains unacknowledged for more than 2 days, escalate the incident.

Incident Rule Set Types

There are two types of Incident Rule Sets:

  • Enterprise: Used to implement all operational practices within your IT organization. All supported actions are available for this type of rule set. However, because this type of rule set can perform all actions, there are restrictions as to who can create an enterprise rule set.

    In order to create or edit an enterprise rule set, an administrator must have been granted the "Create Enterprise Rule Set" privilege on the "Enterprise Rule Set" resource. An enterprise rule set can have multiple authors; however, if the originator of the rule set wants other administrators to edit the rule set, he will need to share access in order to work collaboratively. Incident rule sets are visible to all administrators.

  • Private: If an administrator does not have the Create Enterprise Rule Set resource privilege and consequently cannot create an enterprise rule set, but wants to be notified about something he is monitoring, he can create a private rule set. The only action a private rule set can perform is to send e-mail to the rule set owner. Any administrator can create a private rule set.
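The difference between the two rule set types can be summarized as a small Python sketch. It is a simplified model under stated assumptions, not an Enterprise Manager API: the `Administrator` class, the function names, and the action labels are hypothetical; only the privilege name and the restriction on private rule sets come from the text above.

```python
from dataclasses import dataclass, field

# Privilege name as given in this chapter.
CREATE_ENTERPRISE_RULE_SET = "Create Enterprise Rule Set"


@dataclass
class Administrator:
    name: str
    privileges: set = field(default_factory=set)


def allowed_rule_set_types(admin):
    """Any administrator may create a private rule set; enterprise needs the privilege."""
    types = ["private"]
    if CREATE_ENTERPRISE_RULE_SET in admin.privileges:
        types.append("enterprise")
    return types


def allowed_actions(rule_set_type):
    """Private rule sets can only e-mail their owner (action labels are illustrative)."""
    if rule_set_type == "private":
        return {"email_owner"}
    return {"email_owner", "create_incident", "notify", "create_ticket", "escalate"}


joe = Administrator("Joe")
print(allowed_rule_set_types(joe))  # prints ['private']
```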

Rules

Rules are instructions within a rule set that automate actions on incoming events or incidents or problems (specialized incidents for Oracle software). Because rules operate on incoming incidents/events/problems, if you create a new rule, it will not act retroactively on incidents/events/problems that have already occurred.

Every rule is composed of two parts:

  • Criteria: The events/incidents/problems on which the rule applies.

  • Action(s): The ordered set of one or more operations on the specified events/incidents/problems. Each action can be executed based on additional conditions.

The following table illustrates how rule criteria and actions determine rule application:

Table 2-1 Rule Operation

Rule Criteria                                  Rule Action: Condition   Rule Action: Actions
---------------------------------------------  -----------------------  -------------------------
CPU Util(%), Tablespace Used(%) metric alert   (none)                   Create incident
events of warning or critical severity

Incidents of warning or critical severity      If severity = critical   Notify by page
                                               If severity = warning    Notify by e-mail

Incidents open for more than 7 days            (none)                   Set escalation level to 1


From the rule operation example shown in the table, the rule applies to two metric alert events: CPU Utilization and Tablespace Used. Whenever these events reach either Warning or Critical severity threshold levels, an incident is created. Additional conditions have been added to the rule criteria that determine what actions are to be taken. When the incident severity level (the incident severity is inherited from the worst event severity) reaches Warning, Enterprise Manager sends an e-mail to the administrator. If the incident severity level reaches Critical, Enterprise Manager sends a page to the administrator.
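The rule operation in Table 2-1 can be expressed as a short Python sketch that maps criteria and conditions to actions. This is a hedged illustration only: the `Incident` class and both functions are hypothetical, and real incident rules are defined through the Enterprise Manager console rather than in code.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    severity: str          # "warning" or "critical"
    days_open: int
    escalation_level: int = 0


def actions_for_event(metric, severity):
    """First row of Table 2-1: create an incident for the two monitored metrics."""
    monitored = {"CPU Util(%)", "Tablespace Used(%)"}
    if metric in monitored and severity in {"warning", "critical"}:
        return ["create incident"]
    return []


def actions_for_incident(incident):
    """Second and third rows: conditional notification, then escalation by age."""
    actions = []
    if incident.severity == "critical":
        actions.append("notify by page")
    elif incident.severity == "warning":
        actions.append("notify by e-mail")
    if incident.days_open > 7:
        incident.escalation_level = 1
        actions.append("set escalation level to 1")
    return actions


print(actions_for_event("CPU Util(%)", "critical"))   # prints ['create incident']
print(actions_for_incident(Incident("warning", 2)))   # prints ['notify by e-mail']
```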

Incident Rule Set Guidelines

When creating incident rule sets, adhering to the following guidelines will result in efficient use of system resources as well as operational efficiency.

  • For rule sets that operate on targets (for example, hosts and databases), use groups to consolidate all targets into a single target for the rule set.

  • Consolidate all rules that apply to the group members within the same rule set.

  • Leverage the execution order of rules within the rule set.

When deciding how to use different rule types within the rule set, adhere to the following rule usage guidelines:

Table 2-2 Incident Rule Usage Guidelines

Rule Usage          Application
------------------  ------------------------------------------------------------
Rules on Events     • To create incidents for the alerts/events managed in
                      Enterprise Manager
                    • To create tickets for incidents managed by helpdesk
                      analysts
                    • Create incidents based on event, then create ticket for
                      the incident
                    • Send events to third-party management systems
                    • To send notifications on events (no incident created)

Rules on Incidents  • Automate management of incident workflow (assign owner,
                      set priority, escalation levels, and so on) and send
                      notifications
                    • Create tickets based on incident conditions. For example,
                      create a ticket if the incident is escalated to level 2.

Rules on Problems   • Automate management of problem workflow (assign owner,
                      set priority, escalation levels, and so on) and send
                      notifications


Incident Rule Set Example

The following example illustrates many of the implementation guidelines just discussed. All targets have been consolidated into a single group, all rules that apply to group members are part of the same rule set, and the execution order of the rules has been leveraged.

All rules in the rule set perform three types of actions: incident creation, notification, and escalation.

graphic shows an example rule set containing 3 rules.

In a more detailed view of the rule set, we can see how the guidelines have been followed.

graphic shows a detailed view of the rule set where actual rules have been added.

In this detailed view, there are five rules that apply to all group members. The execution sequence of the rules (rule 1 - rule 5) has been leveraged to correspond to the three types of rule actions in the rule set:

  • Rules 1-3: Incident Creation

  • Rule 4: Notification

  • Rule 5: Escalation

By synchronizing rule execution order with the progression of rule action categories, maximum efficiency is achieved.
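To see why execution order matters, consider the following minimal Python sketch. It assumes a simplified model in which each rule is a name, a criteria function, and an action function applied to a shared state dictionary; the rule names, state keys, and `run_rule_set` helper are all hypothetical and do not represent how Enterprise Manager evaluates rules internally.

```python
def run_rule_set(rules, state):
    """Apply rules in their listed order and return the names of the rules that fired."""
    fired = []
    for name, criteria, action in rules:
        if criteria(state):
            action(state)
            fired.append(name)
    return fired


def new_state():
    # A target-down event that has been outstanding for 8 days, with no incident yet.
    return {"event": "target down", "incident": False, "days_open": 8,
            "notifications": [], "escalation": 0}


rules = [
    ("create incident",
     lambda s: s["event"] == "target down" and not s["incident"],
     lambda s: s.update(incident=True)),
    ("notify",
     lambda s: s["incident"],
     lambda s: s["notifications"].append("e-mail sent")),
    ("escalate",
     lambda s: s["incident"] and s["days_open"] > 7,
     lambda s: s.update(escalation=1)),
]

print(run_rule_set(rules, new_state()))        # creation, then notification, then escalation
print(run_rule_set(rules[::-1], new_state()))  # reversed order: only creation fires
```

In this model, the notification and escalation rules depend on an incident that only exists after the creation rule has run, so placing creation first lets all three act in a single pass.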

Event Prioritization

When working in a large enterprise it is conceivable that when systems are under heavy load, an extraordinarily large number of incidents and events will be generated. All of these need to be processed in a timely and efficient manner in accordance with your business priorities. To have them processed sequentially can result in long waits before incidents can be resolved: High priority events/incidents need to be addressed before those of low priority.

In order to determine which events/incidents are high priority, Enterprise Manager uses a prioritization protocol based on two incident/event attributes: Lifecycle Status of the target and the Incident/Event Type. A target's Lifecycle Status is set when it is added to Enterprise Manager for monitoring. At that time, you determine where in the prioritization hierarchy that target belongs; the highest level being "mission critical" and the lowest being "development."

Target Lifecycle Status

  • Mission Critical (highest priority)

  • Production

  • Stage

  • Test

  • Development (lowest priority)

Incident/Event Type

  • Availability (highest priority)

  • All events/incidents (Fatal severity)

  • All events/incidents (Warning and Critical severities)

  • All events/incidents (Informational) (lowest priority)
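The two priority lists above can be combined into a sort key, as in the following Python sketch. Note the assumptions: the text does not state how the two attributes are weighed against each other, so comparing Lifecycle Status first and Incident/Event Type second is a guess made for illustration, and the dictionary keys and function names are hypothetical.

```python
# Lower rank means higher priority, following the order of the lists above.
LIFECYCLE_RANK = {
    "Mission Critical": 0,
    "Production": 1,
    "Stage": 2,
    "Test": 3,
    "Development": 4,
}


def type_rank(item):
    """Rank by incident/event type: availability first, then by severity band."""
    if item["type"] == "Availability":
        return 0
    severity_band = {"Fatal": 1, "Critical": 2, "Warning": 2, "Informational": 3}
    return severity_band[item["severity"]]


def processing_order(items):
    # Assumption: lifecycle status is compared first, then incident/event type.
    return sorted(items, key=lambda i: (LIFECYCLE_RANK[i["lifecycle"]], type_rank(i)))


queue = [
    {"name": "dev host CPU", "lifecycle": "Development",
     "type": "Metric Alert", "severity": "Critical"},
    {"name": "prod DB down", "lifecycle": "Production",
     "type": "Availability", "severity": "Fatal"},
    {"name": "prod tablespace", "lifecycle": "Production",
     "type": "Metric Alert", "severity": "Warning"},
]

for item in processing_order(queue):
    print(item["name"])
```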

Problems

A problem represents the underlying root cause of an incident and requires further analysis beyond the immediate resolution of the incident. For Enterprise Manager 12c, problems focus on the diagnostic incidents and problems generated by the Automatic Diagnostic Repository (ADR). Because the Support Workbench problems and diagnostic incidents are propagated to Incident Manager, you can perform additional tracking such as viewing problems across different databases. A problem represents the root cause of all the Oracle software incidents.

When a problem is raised for Oracle software, Oracle has determined that the only recourse is to open a service request (SR), send Oracle Support the diagnostic logs, and eventually receive a patch. Because a problem is a specialized incident, Enterprise Manager makes available all tracking, diagnostic, and reporting functions for problem management. Whenever you view all open incidents and problems, whether you are using the Incident Manager console or working in the context of a target/group home page, you can easily determine what issues are actually affecting your database.

The following figure shows the tracking and diagnostic functionality available for problems from the Incident Manager console.

Incident Manager Console

Incident Manager provides, in one location, the ability to search, view, manage, and resolve incidents and problems impacting your environment. Use Incident Manager to perform the following tasks:

Figure 2-5 Incident Manager Console

graphic shows the incident manager console.

The advantages of using Incident Manager include:

Moving from Enterprise Manager 10/11g to 12c

Enterprise Manager 12c incident management/monitoring functionality leverages your existing pre-12c monitoring setup out-of-box. Migration is seamless and transparent. For example, if your Enterprise Manager 10/11g monitoring system sends you e-mails based on specific monitoring conditions, you will continue to receive those e-mails without interruption. To take advantage of 12c features, however, you may need to perform additional migration tasks.

Important:

Alerts that were generated pre-12c will still be available.

Incident Rules

When you migrate to Enterprise Manager 12c, all of your existing notification rules are automatically converted to incident rules. Technically, they are converted to event rules first, with incidents automatically being created for each event rule.

In general, event rules allow you to define which events should become incidents. However, they also allow you to take advantage of Enterprise Manager's increased monitoring flexibility: while an incident is open, if you want to monitor the status of individual events within the incident, you can utilize individual event rules as a way to obtain those notifications.

For more information on incident rule migration, see the following documents:

Privilege Requirements

The 'Rule Set' resource privilege is now required in order to edit/create enterprise rule sets and the rules contained within. The exception to this is migrated notification rules. When pre-12c notification rules are migrated to event rules, the original notification rule owners will still be able to edit their own rules without having been granted the Create Enterprise Rule Set resource privilege. However, they must be granted the 'Rule Set' resource privilege if they wish to create new rules. Enterprise Manager Super Administrators, by default, can edit and create rule sets.

Before Working with Incidents

Before using Incident Manager, ensure all relevant Enterprise Manager administrator accounts have been granted the appropriate privileges to manage incidents. Also, ensure that the notification system is properly configured to allow automated notification for incidents.

Granting User Privileges for Events, Incidents and Problems

Users are granted privileges for events, incidents, and problems in the following situations:

For events, two privileges are defined:

For incidents, two privileges are defined:

For problems, two privileges are defined:

Working with Incidents

You can perform the following tasks using Incident Manager:

Setting Up Incident Views

Incident views allow you to save commonly used incident search criteria for repeated use. For example, you can set up a filter in Incident Manager to view incidents for targets managed by a specific group of administrators.

By specifying preferences to view the following for each of the incidents in the list: incident severity, incident message, acknowledgement flag, date the incident triggered, administrator assigned to it, resolution status, priority, escalated flag, ID, and category, you can filter extraneous incidents. Once the view preference is saved, Enterprise Manager will display only the list of matching incidents.

You can then search the incidents for only the ones with specific attributes, such as priority P1. While reviewing the incidents, you can specify one-click access to this list so that it can be easily accessed for daily triaging activity. Accordingly, you can save the search criteria as a filter named "All P1 incidents for my targets". The filter becomes available in the UI for immediate use. The filter will show up anytime you log in to access the specific incidents quickly.

Perform the following steps:

  1. Navigate to the Incident Manager page.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. In the Views region located on the left, click Search.

    1. In the Search region, search for Incidents using the Type list and select Incidents.

    2. In the Criteria region, choose all the criteria that are appropriate. To add fields to the criteria, click Add Fields... and select the appropriate fields.

    3. After you have provided the appropriate criteria, click Get Results.

    4. To view all the columns associated with this table, in the View menu, select Columns, then select Show All.

      Validate that the list of incidents matches what you are looking for. If not, change the search criteria as needed.

    5. Click the Create View... button.

Using Views to Filter Incidents, Problems, and Events

A view is a set of search criteria for filtering incidents and problems in the system. You can define views to help you gain quick access to the incidents and problems on which you need to focus. For example, you may define a view to display all the incidents associated with the production databases that you own.
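Conceptually, a view is a named set of attribute filters. The following Python sketch models that idea under simplifying assumptions: incidents are plain dictionaries, criteria are exact-match values, and the `make_view` and `apply_view` functions and the attribute names are hypothetical. In Enterprise Manager itself, views are created through the Incident Manager user interface.

```python
def make_view(name, **criteria):
    """Save a name together with the attribute values an incident must match."""
    return {"name": name, "criteria": criteria}


def apply_view(view, incidents):
    """Return only the incidents whose attributes equal every saved criterion."""
    return [
        inc for inc in incidents
        if all(inc.get(key) == value for key, value in view["criteria"].items())
    ]


# Illustrative data; the owner criterion stands in for "my targets".
incidents = [
    {"id": 1, "priority": "P1", "owner": "scott", "status": "New"},
    {"id": 2, "priority": "P2", "owner": "scott", "status": "New"},
    {"id": 3, "priority": "P1", "owner": "sam", "status": "Work in Progress"},
]

p1_view = make_view("All P1 incidents for my targets", priority="P1", owner="scott")
print([inc["id"] for inc in apply_view(p1_view, incidents)])  # prints [1]
```

Because the criteria are stored rather than the results, reapplying the view later picks up any new incidents that match.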

Responding and Working on a Simple Incident

Before you begin working on resolving an incident, ensure your Enterprise Manager account has been granted the appropriate privileges to manage incidents from your managed system.

  • Privileges on events are calculated based on the privilege on the underlying source objects. For example, the user will have VIEW privilege on an event if he can view the target for the event.

  • Privileges on an incident are calculated based on the privileges on participating events.

  • Similarly, problem privileges are calculated based on privileges on underlying incidents.

Perform the following steps:

  1. Navigate to Incident Manager.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. To view incidents assigned to you, in the table look at the Owner column. If the Owner column is not displayed, in the View menu select Columns, then select Owner.

    Work on the incident with the highest priority. Be aware that as you are working on an individual incident, new incidents might be coming in. Update the list of incidents by clicking the Refresh icon.

  3. To work on an incident, click the incident. In the General section, click Manage and change the fields as appropriate. For example, set the status to Work in Progress and in the Owner field, type your name.

  4. If the solution for the incident is unknown, use one or all of the following methods made available in the Incident page:

    • Use the Guided Resolution region and access any recommendations, diagnostic and resolution links available.

    • Check My Oracle Support Knowledge base for known solutions for the incident.

    • Study related incidents available through the Related Events and Incidents tab.

  5. Once the solution is known and can be resolved right away, resolve the incident by using tools provided by the system, if possible.

  6. In most cases, once the underlying cause has been fixed, the incident is cleared in the next evaluation cycle. However, in cases like "log based" incidents, clear the event.

Suppressing Incidents and Problems

There are times when it is convenient to hide an incident or problem from the list in the All Open Incidents page or the All Open Problems page. For example, you may want to suppress an incident while it is being actively worked on and you do not need to be notified.

To suppress an incident or problem:

  1. Navigate to the Incident Manager page.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. Select either the All Open Incidents view or the All Open Problems view. Choose the appropriate incident or problem. Click the General tab.

  3. In the resulting Details region, click More, then select Suppress.

  4. On the resulting Suppress pop-up, choose the appropriate suppression type. Add a comment if desired. Click OK.

Searching My Oracle Support Knowledge

To access My Oracle Support Knowledge base entries from within Incident Manager, perform the following steps:

  1. Navigate to the Incident Manager page.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. Select one of the standard views. Choose the appropriate incident or problem in the View table.

  3. In the resulting Details region, click My Oracle Support Knowledge. Sign in to My Oracle Support.

  4. On the My Oracle Support page, click the Knowledge tab to browse the knowledge base and knowledge alerts.

Open Service Request

There are times when you may need assistance from Oracle Support to resolve a problem. To submit a service request (SR), perform the following steps:

  1. Navigate to the Incident Manager page.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. Select one of the standard views. Choose the appropriate problem from the table.

  3. In the resulting Details region, click My Oracle Support Knowledge. Sign in to My Oracle Support if you are not already signed in.

  4. On the My Oracle Support page, click the Service Requests tab.

  5. Click the Create SR button. Click Help to learn how to create a new SR.

Incident Manager - Advanced Tasks

You can perform the following advanced tasks using Incident Manager:

Creating an Incident Manually

To create an incident manually, perform the following:

  1. Create incident in context of an event.

  2. Enter details and save the incident.

  3. Set yourself as owner of the incident and update status to Work in Progress.

Example Scenario

As per the operations policy, the DBA manager has set up rules to create incidents for all critical issues for his databases. The remainder of the issues are triaged at the event level by one of the DBAs.

One of the DBAs receives an e-mail for a "SQL Response" event (not associated with an incident) on the production database. He accesses the details of the event by clicking on the link in the e-mail. He reviews the details of the event. This is an issue that needs to be tracked and resolved, so he opens an incident to track the resolution of the issue. He marks the status of the incident as "Work in Progress".

Managing Workload Distribution of Incidents

Incident Manager enables you to manage incidents and problems to be addressed by your team.

Perform the following tasks:

  1. Navigate to Incident Manager.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. Use the standard or custom views to identify the incidents for which your team is responsible. You may want to focus on unassigned and unacknowledged incidents and problems.

  3. Review the list of incidents. This includes: determining person assigned to the incident, checking its status, progress made, and actions taken by the incident owner.

  4. Add comments, change priority, re-assign the incident as needed by clicking on the Manage button in the Incident Details region.

Example Scenario

The DBA manager uses Incident Manager to view all the incidents owned by his team. He ensures all of them are correctly assigned; if not, he re-assigns and prioritizes them appropriately. He monitors the escalated events for their status and progress, adds comments as needed for the owner of the incident. In the console, he can view how long each of the incidents has been open. He also reviews the list of unassigned incidents and assigns them appropriately.

Managing and Automating Incident Workflow

Data centers follow operational practices that enable them to manage events and incidents by business priority and in a collaborative manner. Enterprise Manager provides the following features to enable this management and automation:

  • Sending notifications to the appropriate administrators

  • Assigning initial ownership of an incident and perhaps transferring ownership based on shift assignments or expertise

  • Tracking its resolution status

  • Assigning priorities based on the component affected and nature of the incident

  • Escalating incidents in order to meet service level agreements (SLA)

  • Accessing My Oracle Support knowledge articles

  • Opening Oracle Service Requests to request assistance with problems with Oracle software

  • Generating management and operational reports to track the status of incidents

You can manage an incident by doing the following:

  1. In the All Open Incidents view, click the incident.

  2. In the resulting Details page, click the General tab, then click Manage.

    You can then adjust the priority, escalate the incident, and assign it to a specific engineer.

Set Up Tasks to Perform Before Using Incident Rules

Before you use incident rules, ensure the following prerequisites have been set up:

To perform these tasks, click Setup on the Enterprise Manager home page, select Security, then select Administrators to access the Administrators page.

Graphic displays the Administrators page.

Privileges Required for Enterprise Rule Sets

As the owner of the rule set, an administrator can perform the following:

If an incident or problem rule has an update action (for example, change priority), it will take the action only if the owner of the respective rule set has manage privilege on the matching incident or problem.

To acquire privileges, click Setup on the Enterprise Manager home page, select Security, then select Administrators to access the Administrators page. Select an administrator from the list and click Edit to access the Administrator properties wizard as shown in the following graphic.

Graphic shows the Administrator edit wizard.

Working with Incident Rules

You can perform the following tasks using Incident Rules:

Setting Up the Monitoring Environment by Defining Incident Rules

One way to set up your monitoring environment is by defining incident rules. You can set up rules to:

  1. Create an incident in response to an event

  2. Send notifications to different users

  3. Manage escalation of incidents and problems

  4. Create a ticket for incidents

  5. Notify different administrators for different classes of events

  6. Create a notification subscription to existing enterprise rules

Creating an Incident Rule

To create an incident rule, perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu located at the top-right of the Enterprise Manager home page, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, edit the existing rule set (highlight the rule set and click Edit...) or create a new rule set. (Rules are created in the context of a rule set.)

  3. In the Rules tab of the Edit Rule Set page, click Create... and select the type of rule to create (Event, Incident, Problem) on the Select Type of Rule to Create page. Click Continue.

  4. In the Create New Rule wizard, provide the required information. Click Help for information regarding the wizard pages.

  5. Once you have finished defining the rule, click Continue to add the rule to the rule set. Click Save to save the changes made to the rule set.

Creating a Rule to Create an Incident

To create a rule that creates an incident, perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu located at the top-right of the Enterprise Manager home page, select Incidents, then select Incident Rules.

  2. Determine whether there is an existing rule set that contains a rule that manages the event. In the Incident Rules page, use the Search option to find the events for the target and the associated rule set.

    Note: In the case where there is no existing rule set, create a rule set by clicking Create Rule Set... You then create the rule as part of creating the rule set.

  3. Select the rule set that will contain the new rule. Click Edit... In the Rules tab of the Edit Rule Set page, do the following:

    1. Click Create...

    2. Select "Incoming events and updates to events"

    3. Click Continue.

    Alternatively, select Actions, then select Add rule for event...

    Provide the rule details using the Create New Rule wizard.

    1. Select the Event Type the rule will apply to, for example, Metric Alert. (Metric Alert is available for rule sets of the type Targets.) You can then specify metric alerts by selecting Specific Metrics. The table for selecting metric alerts displays. Click the +Add button to launch the metric selector. On the Select Specific Metric Alert page, select the target type, for example, Database Instance. A list of relevant metrics displays. Select the ones in which you are interested. Click OK.

      You also have the option to select the severity and corrective action status.

    2. Once you have provided the initial information, click Next. Click +Add to add the actions to occur when the event is triggered. One of the actions is to Create Incident.

      As part of creating an incident, you can assign the incident to a particular user, set the priority, and create a ticket. Once you have added all the conditional actions, click Continue.

    3. After you have provided all the information on the Add Actions page, click Next to specify the name and description for the rule. Once on the Review page, verify that all the information is correct. Click Back to make corrections; click Continue to return to the Edit (Create) Rule Set page.

    4. Click Save to ensure that the changes to the rule set and rules are saved to the database.

  4. Test the rule by generating a metric alert event on the metrics chosen in the previous steps.

Creating a Rule to Manage Escalation of Incidents

Before you set up a rule to manage escalations, ensure the following prerequisite task has been performed:

  • The DBA has set up appropriate thresholds for the metric so that a critical metric alert is generated as expected.

Perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu located at the top-right of the Enterprise Manager home page, select Incidents, then select Incident Rules.

  2. Determine whether there is an existing rule set that contains a rule that manages the incident. In the Incident Rules page, use the Search option to find the incident and the associated rule set.

    Note: In the case where there is no existing rule set, create a rule set by clicking Create Rule Set... You then create the rule as part of creating the rule set.

  3. Select the rule set that will contain the new rule. Click Edit... Then, in the Rules tab of the Edit Rule Set page:

    1. Click Create ...

    2. Select "Incoming events and updates to events"

    3. Click Continue.

  4. For demonstration purposes, the escalation concerns a production database.

    As per the organization's policy, the DBA manager is notified for escalation level 1 incidents. Similarly, the DBA director and operations VP are paged for incidents escalated to levels 2 and 3 respectively.

    Provide the rule details using the Create New Rule wizard.

    1. Select Specific Incidents where the Target Attribute has a value of Database Instance.

    2. In the Conditions for Actions region located on the Add Actions page, select Execute the actions on the conditions specified.

      Select How long the incident is open and in a particular state (select time and optional expressions)

      Select the Time to be 30 minutes and the Attribute Name to be Escalation Level with a value of 1. Click Continue.

    3. In the Basic Notification region, type the name of the administrator to be notified by e-mail or page.

    4. Repeat steps 2 and 3 to notify the DBA director when the escalation level is 2 and the Operations VP when the escalation level is 3.

    5. Review the summary and save the rule.

    6. Click Next until you get to the Summary screen. Verify that the information is correct and click Save.

  5. Review the sequence of existing enterprise rules and position the newly created rule in the sequence.

    On the Edit Rule Set page, select Actions, then select Reorder Rules. Click Save to save the change to the sequence.

Example Scenario

In many companies, the operations team handles incidents at different escalation levels. An incident is escalated to a higher level based on how long the incident remains unresolved.

To facilitate this process, the administration manager creates a rule to escalate unresolved incidents based on their age:

  • To level 1 if the incident is open for 30 minutes

  • To level 2 if the incident is open for 1 hour

  • To level 3 if the incident is open for 90 minutes

As per the organization's policy, the DBA manager is notified for escalation level 1. Similarly, the DBA director and operations VP are paged for incidents escalated to levels 2 and 3 respectively.

Accordingly, the administration manager enters the above logic and the respective Enterprise Manager administrator IDs in a separate rule to meet the above notification requirement. The Enterprise Manager administrator IDs represent the respective users with the required target privileges and notification preferences (that is, e-mail addresses and schedule).
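The age-based escalation and notification policy in this scenario can be sketched in a few lines of Python. This is only an illustration of the logic, not Enterprise Manager code: the function names, the threshold table, and the role labels are assumptions chosen for the example.

```python
from datetime import timedelta
from typing import Optional

# Thresholds from the example scenario, checked from the highest level down.
# The table layout itself is an assumption made for this sketch.
ESCALATION_THRESHOLDS = [
    (timedelta(minutes=90), 3),
    (timedelta(minutes=60), 2),
    (timedelta(minutes=30), 1),
]

# Hypothetical role labels standing in for Enterprise Manager administrator IDs.
NOTIFY_BY_LEVEL = {1: "DBA manager", 2: "DBA director", 3: "operations VP"}


def escalation_level(open_for: timedelta) -> int:
    """Return the escalation level for an incident that has been open this long."""
    for threshold, level in ESCALATION_THRESHOLDS:
        if open_for >= threshold:
            return level
    return 0  # not yet old enough to escalate


def recipient_for(level: int) -> Optional[str]:
    """Return who is notified at a given escalation level, if anyone."""
    return NOTIFY_BY_LEVEL.get(level)
```

For example, an incident that has been open for 45 minutes maps to level 1 and the DBA manager, while one open for 95 minutes maps to level 3 and the operations VP.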

Creating a Rule to Escalate a Problem

Before you create a rule to escalate a problem, ensure the following prerequisites are met:

  • An incident rule has been set up to generate appropriate incidents when a critical issue occurs.

  • The administrator attaches incidents to the problem if the underlying issue is the one being tracked by the problem.

Perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu located at the top-right of the Enterprise Manager home page, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, either create a new rule set (click Create Rule Set...) or edit an existing rule set (highlight the rule set and click Edit...). (Rules are created in the context of a rule set.)

  3. In the Rules section of the Edit Rule Set page, select Create... to create an enterprise rule to automate actions on the problem. Select Problem Rule on the Select Type of Rule to Create page. Click Continue.

  4. On the Create New Rule page, select Specific problems and add the following criteria:

    The Attribute Name is Incident Count, the Operator is Greater than or equals and the Values is 20. Click Next.

  5. In the Conditions for Actions region on the Add Actions page, select Always execute the action. Specify the following actions to take when the rule matches the condition:

    • In the Notifications region, send e-mail to the owner of the problem and to the Operations Manager.

    • In the Update Problem region, select Escalate and choose 1 as the appropriate level.

    Click Continue.

  6. Review the rules summary. Make corrections as needed. Click Save.

Example Scenario

In an organization, whenever an unresolved problem has 20 or more associated incidents, the problem should be escalated to prioritize its resolution. Accordingly, a problem rule is created to observe the count of incidents attached to the problem and escalate the problem when the count reaches the limit.

The problem owner and the Operations manager are notified by way of e-mail.
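The condition and actions of this problem rule can be sketched as follows. The Problem class, its field names, and the default recipient name are hypothetical and exist only for this illustration; they are not part of any Enterprise Manager interface.

```python
from dataclasses import dataclass, field
from typing import List

# Limit used in the procedure above (Incident Count greater than or equals 20).
INCIDENT_COUNT_LIMIT = 20


@dataclass
class Problem:
    """Hypothetical stand-in for a problem tracked in Incident Manager."""
    owner: str
    incident_count: int
    escalation_level: int = 0
    notified: List[str] = field(default_factory=list)


def apply_problem_rule(problem: Problem, operations_manager: str = "ops_manager") -> bool:
    """Escalate to level 1 and record e-mail recipients when the limit is reached."""
    if problem.incident_count >= INCIDENT_COUNT_LIMIT:
        problem.escalation_level = 1
        problem.notified = [problem.owner, operations_manager]
        return True
    return False
```

A problem with 19 attached incidents is left unchanged; once the count reaches 20, the problem is escalated and both the owner and the Operations manager are recorded as recipients.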

Setting Up Automated Notification for Private Rule

A DBA has set up a backup job on the database that he is administering. As part of the job, the DBA has subscribed to e-mail notification for "completed" job status.

Before you create the rule, ensure the following prerequisites are met:

  • You have the privilege to create jobs.

  • You have created a database backup job.

Perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu located at the top-right of the Enterprise Manager home page, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, either edit an existing rule set (highlight the rule set and click Edit...) or create a new rule set.

    Note: The rule set must be defined as a Private rule set.

  3. In the Rules tab of the Edit Rule Set page, select Create... and select Event Rule. Click Continue.

  4. On the Select Events page, select Job Status Change as the Event Type. Select the job in which you are interested either by selecting a specific job or selecting a job by providing a pattern, for example, Backup Management.

    Add additional criteria by adding an attribute: Target Type as Database Instance.

  5. Add conditional actions: Event matches the following criteria (Severity is Informational) and E-mail Me for notifications.

  6. Review the rules summary. Make corrections as needed. Click Save.

  7. Create a database backup job and subscribe for e-mail notification when the job completes.

When the job completes, Enterprise Manager publishes the informational event for the "Job Complete" state of the job. The newly created rule matches the event and an e-mail is sent to the DBA.

The DBA receives the e-mail and clicks the link to access the details section in the Enterprise Manager console for the event.
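The criteria used by this private rule can be sketched as a simple predicate. The event is modeled as a plain dictionary with hypothetical keys, and the shell-style wildcard matching through `fnmatch` is an assumption made for the example; the pattern syntax that Enterprise Manager actually accepts is not described here.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def rule_matches(event: Dict[str, str], job_pattern: str = "Backup*") -> bool:
    """Check an event against the criteria chosen in the procedure above."""
    return (
        event.get("event_type") == "Job Status Change"
        and fnmatchcase(event.get("job_name", ""), job_pattern)  # assumed wildcard syntax
        and event.get("target_type") == "Database Instance"
        and event.get("severity") == "Informational"
    )


def actions_for(event: Dict[str, str]) -> List[str]:
    """Return the notification actions the rule would take for this event."""
    return ["E-mail Me"] if rule_matches(event) else []
```

With these assumptions, an informational Job Status Change event for a job named Backup Management on a Database Instance target yields the E-mail Me action, and any event that fails one of the criteria yields no action.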

Creating a Rule to Receive Notification Regarding Events

To create a rule to receive notification on events, perform the following steps:

  1. Navigate to the Incident Rules - All Enterprise Rules page.

    From the Enterprise Manager home page, select Setup located at the top-right of the page, select Incidents, then select Incident Rules.

  2. Edit an existing enterprise rule set.

    Highlight the rule set and click Edit...

  3. In the Rules section of the Edit Rule Set page:

    1. Click Create ...

    2. Select "Incoming events and updates to events"

    3. Click Continue.

    Select an event type, for example, Target Availability. Add Specific Target Availability Events, for example, Host and select the specific availability events in which you are interested. Additional Criteria can include Severity of Critical. Click Next.

  4. To be notified of the event, define additional actions using the Add Actions page. For the conditions under which the event occurs, select Always Execute the Actions. For the notifications, provide information in the Basic Notifications region.

  5. When you receive the e-mail regarding the event, click on the link to access the details section in the Enterprise Manager console for the event.

Setting Up Escalations

In an organization, there are times when incidents need to be escalated. To escalate an incident to another person, perform the following steps:

  1. Navigate to the Incident Rules page.

    Click Setup located at the top-right of the page. Select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, highlight the rule set of which the rule is a member. Click Edit.

  3. On the Edit Rule Set page, scroll down to the Rules section and highlight the rule you want to edit. Click Edit...

  4. In the Edit Rule Set wizard, select the incidents to which you want this escalation to apply by selecting Specific Incidents and selecting Escalation Level as the Attribute Name. Provide an escalation value. Click Next.

  5. In the Add Actions page, click +Add to add an action.

    In the Update Incident section, check Escalate to and choose the option to which to escalate the incident. For example, choose 1 in the associated list. Click Continue.

  6. Click Next to specify the name and description. Click Next again to access the Review page.

  7. Review the rules summary. Make corrections as needed. Click Next, then click Save.

Incident Rules - Advanced Tasks

You can perform the following advanced tasks using Incident Rules:

Setting Up a Rule to Send Different Notifications for Different Severity States of an Event

Before you perform this task, ensure the following prerequisite is met:

  • The DBA has set up appropriate thresholds for the metric so that a critical metric alert is generated as expected.

Perform the following tasks:

  1. Navigate to the Incident Rules page.

    From the Setup menu located at the top-right of the Enterprise Manager home page, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, highlight a rule set and click Edit... (Rules are created in the context of a rule set. If there is no existing rule set to manage the newly added target, create a rule set.)

  3. In the Edit Rule Set page, locate the Rules section. From the Actions menu, select Add Rule for Event.

  4. Provide the rule details as follows:

    1. For Type, select Metric Alert.

    2. In the Additional Criteria section, select Severity as the Attribute Name and Critical as the Value. Click Next.

    3. On the Add Actions page, click +Add. In the Notifications section, provide the contact information for the DBA to be paged. Click Continue until you reach the Edit Rule Set page.

    4. Highlight the rule again. For Event Type Specific Criteria, select Metric Alert as the Type.

    5. In the Additional Criteria section, select Severity as the Attribute Name and Warning as the Value. Click Next.

    6. On the Add Actions page, click +Add. In the Notifications section, provide the contact information for the DBA to be e-mailed. Click Continue until you reach the Edit Rule Set page.

    7. Click Next until you get to the Summary screen. Verify that the information is correct and click Save.

Example Scenario

The Administration Manager sets up a rule to page the specific DBA when a critical metric alert event occurs for a database in a production database group and to e-mail the DBA when a warning metric alert event occurs for the same targets. This task occurs when a new group of databases is deployed and DBAs request to create appropriate rules to manage such databases.
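The severity-dependent behavior described in this scenario amounts to a small lookup. The sketch below is illustrative only; the function name and the action labels "page" and "e-mail" are assumptions, not Enterprise Manager identifiers.

```python
from typing import Optional

# Action per severity for metric alert events, as described in the scenario.
ACTION_BY_SEVERITY = {"Critical": "page", "Warning": "e-mail"}


def notification_for(event_type: str, severity: str) -> Optional[str]:
    """Return how the DBA is contacted for an event, or None if the rule does not apply."""
    if event_type != "Metric Alert":
        return None
    return ACTION_BY_SEVERITY.get(severity)
```

A critical metric alert results in a page, a warning metric alert results in an e-mail, and any other event type or severity is ignored by this rule.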

Creating a Rule to Create a Ticket for Incidents

According to the operations policy of an organization, all critical incidents from a production database should be tracked by way of Remedy tickets. An incident rule is created to invoke the Remedy ticket connector to generate a ticket when a critical incident occurs for the database. When such an incident occurs, the incident rule generates the ticket, the incident is associated with the ticket, and the operation is logged in the incident's updates for future reference. While viewing the details of the incident, the DBA can view the ticket ID and, using the attached URL link, access Remedy to get the details about the ticket.

Before you perform this task, ensure the following prerequisites are met:

  • Monitoring support has been set up.

  • Remedy ticket support has been set up.

Perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu located at the top-right of the Enterprise Manager home page, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, highlight a rule set and click Edit... (Rules are created in the context of a rule set. If there is no existing rule set, create a rule set.)

  3. Select the appropriate rule that covers the incident conditions for which tickets should be generated and click Edit...

    1. Specify that a ticket should be generated for incidents covered by the rule.

    2. Specify the ticket template to be used.

  4. Repeat step 3 until all appropriate rules have been edited.

  5. Click Save.

Creating a Rule to Notify Different Administrators Based on the Event Type

As per operations policy for production databases, the alerts that relate to application issues should go to the application DBAs and the alerts that relate to system parameters should go to the system DBAs. Accordingly, the respective incidents will be assigned to the appropriate DBAs and they should be notified by way of e-mail.

Before you set up rules, ensure the following prerequisites are met:

  • The DBA has set up appropriate thresholds for the metric so that a critical metric alert is generated as expected.

  • An incident rule has been set up to create an incident for all such events.

  • Respective notification setup is complete, for example, global SMTP gateway, e-mail address, and schedule for individual DBAs.

Perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu located at the top-right of the Enterprise Manager home page, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, highlight a rule set and click Edit... (Rules are created in the context of a rule set. If there is no existing rule set, create a rule set.)

  3. Search the list of enterprise rules matching the events from the production database.

  4. Select the rule which creates the incidents for the metric alert events for the database. Click Edit.

  5. Enter the specific metrics that identify application issues as the condition to match the incidents.

  6. Enter the specific metrics that identify issues with system parameters as the condition to match the incidents.

  7. Type a summary message, for example: Assign the incident to Cindy (Enterprise Manager administrator handling the system parameter issues). For the action, select to e-mail her.

  8. Review the rules summary. Make corrections as needed. Click Save.
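The routing policy behind these steps can be sketched as a lookup from metric to DBA group. The metric names below are placeholders invented for the example, not real Enterprise Manager metrics, and the group labels are likewise assumptions.

```python
from typing import Optional

# Placeholder metric names; substitute the metrics selected in steps 5 and 6.
APPLICATION_METRICS = {"example_application_metric"}
SYSTEM_PARAMETER_METRICS = {"example_system_parameter_metric"}


def assignee_group_for(metric_name: str) -> Optional[str]:
    """Return which DBA group an incident for this metric is assigned to and e-mailed."""
    if metric_name in APPLICATION_METRICS:
        return "application DBA"
    if metric_name in SYSTEM_PARAMETER_METRICS:
        return "system DBA"
    return None  # not covered by this rule
```

Keeping the two metric sets separate mirrors the two conditions entered in the rule: each condition matches its own set of metrics and carries its own assignment and e-mail action.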

Creating Notification Subscription to Existing Enterprise Rules

A DBA is aware of an enterprise rule that escalates the incidents he manages when they are not resolved within 2 hours. The DBA wants to be notified when the rule escalates an incident. The DBA can subscribe to the rule and will then be notified whenever it escalates an incident.

Before you set up a notification subscription, ensure the following prerequisites are met:

  • There exists an open incident for a database.

  • There exists a rule that escalates high-priority incidents for databases that have not been resolved in 2 hours.

Perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu located at the top-right of the Enterprise Manager home page, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, edit the existing rule set: highlight the rule set and click Edit... (Rules are created in the context of a rule set. If there is no existing rule set, create a rule set.)

  3. In the Rules section of the Edit Rule Set page, highlight the rule you want to change and click Edit...

  4. Edit the rule associated with e-mail notification. Subscribe yourself to receive e-mail notifications.

  5. Review the rules summary. Make corrections as needed. Click Save.

As a result of the edit to the enterprise rule, when an incident stays unresolved for 2 hours, the rule escalates it to level 1. An e-mail is sent to the DBA notifying him about the escalation of the incident.

The DBA receives the e-mail notification and clicks the link in the e-mail, which takes him to the Incident details page after he logs in to Enterprise Manager. There he views the details pertaining to the database down incident.

Manually Ensuring That There Are No Events That Should Be Incidents

Perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu located at the top-right of the Enterprise Manager home page, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, edit the existing rule set: highlight the rule set and click Edit... (Rules are created in the context of a rule set. If there is no existing rule set, create a rule set.)

  3. In the Rules section of the Edit Rule Set page, choose Events in the Create Rule For list. Click Go.

  4. Provide the following parameters:

    • Specific event type: select Metric Alerts.

    • Target type for the event: select Database Instance.

    • Further filtering of events based on event lifecycle conditions like severity: indicate that the severity should be Critical.

    • Action that should take place when such an event occurs: enter that a specific DBA should be paged.

    • When prompted if any other actions should be taken, answer yes.

    • When prompted for specific lifecycle condition, enter that the severity is Warning.

    • When prompted for what action to take when this event occurs, enter that a specific DBA should be e-mailed.

    • When asked whether another action should take place, enter that there are no other actions for this rule.

Example Scenario

During the initial phase of Enterprise Manager uptake, every day the DBA manager reviews the unacknowledged events on the databases his team is responsible for and filters them to view only the ones that are not tracked by a ticket or incident. He browses such events to ensure that none of them requires an incident to track the issue. If he feels that one such event does require an incident, he creates an incident directly for this event. He also creates incident rules to create an incident when similar events occur in the future.

If he triages an event and feels nobody else has to follow up on it, he marks it as acknowledged. Enterprise Manager filters out events from the Incident Manager that have been acknowledged.
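The daily review described in this scenario is essentially a filter over events. The following sketch models it with a hypothetical Event class whose field names are assumptions made for the example; it does not query Enterprise Manager.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical stand-in for an event shown in Incident Manager."""
    name: str
    acknowledged: bool = False
    incident_id: Optional[str] = None
    ticket_id: Optional[str] = None


def needs_triage(events: List[Event]) -> List[Event]:
    """Keep unacknowledged events that are not tracked by an incident or a ticket."""
    return [
        e for e in events
        if not e.acknowledged and e.incident_id is None and e.ticket_id is None
    ]
```

Acknowledging an event, or attaching it to an incident or ticket, removes it from the list the manager still has to review.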