Oracle® Enterprise Manager Cloud Control Administrator's Guide
12c Release 1 (12.1.0.1)

Part Number E24473-09

3 Using Incident Management

Incident management allows you to monitor and resolve service disruptions quickly and efficiently.

This chapter covers the following topics:

Monitoring and Managing Via Incidents

Enterprise Manager Cloud Control 12c greatly expands target monitoring and management capability beyond previous releases by letting you focus on what is important from a broader monitoring/management perspective rather than focusing on numerous discrete events that may be relevant to a particular situation.

Note:

A mobile application is also available for managing incidents and problems on the go. For more information, see Chapter 12, "Cloud Control Mobile".

What is an event?

An event is a significant occurrence on a managed target that typically indicates something has occurred outside normal operating conditions. Examples of events include: database target down, performance threshold violation, change in application configuration files, or job failure. An event can also be raised to signal successful operations or a job successfully completed.

Previous versions of Enterprise Manager generated alerts for exception conditions (metric alerts). For Enterprise Manager 12c, metric alerts are a type of event, one of many different event types. This event model significantly raises the number of conditions in an IT infrastructure for which Enterprise Manager can detect and raise events across the different functional areas of Enterprise Manager (such as monitoring, compliance, or the job system). Events now provide a uniform way to indicate that something of interest has occurred in a datacenter managed by Enterprise Manager.

What is an incident?

An incident is a situation or issue you need to act on. Of all events raised within your managed environment, there is likely only a subset that you need to act on because they impact your business applications (such as a target down event). An incident is, therefore, a significant event or set of related significant events that need to be managed because they can potentially impact your business applications. To manage this significant subset of events, Enterprise Manager provides incident management features.

Incidents are managed through Incident Manager, which provides you with a central location from which to view, manage, diagnose, and resolve incidents as well as identify, resolve, and eliminate the root cause of disruptions. See Incident Manager for more information.

When you create an incident, you identify the event(s) for which you want an incident to be created. An incident may consist of a single event, as might be the case when you are only interested in whether a single database is down, or something more complex consisting of multiple events, as might be the case when monitoring host resources, where multiple events such as CPU utilization, memory utilization, and swap space utilization events are raised to indicate that machine load is high.

Managing by incidents allows you to focus on the smaller set of important issues in your managed environment. Incident Manager provides a rich set of features to help manage incidents such as the ability to assign incident ownership, track incident resolution status, set incident priority, or set incident escalation level.

Events

By definition, an event is a significant occurrence within your IT infrastructure that Enterprise Manager can detect and subsequently notify interested parties or take action on. An event has very specific attributes that allow Enterprise Manager (and ultimately an Enterprise Manager administrator) to identify, categorize, and manage the event. All events have the following attributes

Event Types

Previous versions of Enterprise Manager let you monitor and manage by discrete signals and notified you by raising a metric alert as a result of threshold violations. For Enterprise Manager 12c, a metric alert is now one of several types of event conditions that Enterprise Manager can monitor. These event conditions are called event types. As shown in the following list, the range of event types greatly expands Enterprise Manager's monitoring flexibility.

Incidents allow you to manage many discrete event types by providing an intuitive way to combine them into meaningful issues that you can act upon.

Event Severity

Another event attribute is severity. Just as previous versions of Enterprise Manager utilized metric alert severity levels, this concept has been extended to all event types. The following table shows the various event severity levels along with the associated icon.

Icon | Severity | Description
fatal_error_16x16.png | Fatal | The monitored target is down (target down event). A Fatal severity is the highest level severity and only applies to the Target Availability event type.
error.png | Critical | Immediate action is required in a particular area. The area is either not functional or indicative of imminent problems.
warning.png | Warning | Attention is required in a particular area, but the area is still functional.
minor_warning_16x16.png | Advisory | While the particular area does not require immediate attention, caution is recommended regarding the area's current state. This severity is used primarily for compliance standards.
info.png | Informational | A specific condition has just occurred.

Incidents

You monitor and manage your Enterprise Manager environment via incidents and not discrete events (even though an incident can conceivably consist of a single event). Managing by incident means that, rather than managing discrete events for your system, you manage an incident that may consist of one significant event (for example, a target down event) or a combination of related events (for example, host CPU utilization, host memory utilization, and host swap utilization events when monitoring host capacity). Incidents add an intuitive layer of abstraction that allows you to manage your monitored systems more efficiently because there is a smaller set of more meaningful incidents to manage.

When an incident is created, Enterprise Manager makes available a rich set of incident management workflow features that let you manage and track the incident through its complete lifecycle. Incident management functions allow you to:

All incident management/tracking operations are carried out from Incident Manager. Creation of incidents for events, assignment of incidents to administrators, setting priority, sending notifications, and other actions can be automated using (incident) rules.

The following examples illustrate how incidents are constructed and how attributes map to various stages of the incident lifecycle.

Incident Composed of a Single Event

The simplest incident is composed of a single event. In the following example, you are concerned whenever any production target is down. You can create an incident for the target down event which is raised by Enterprise Manager if it detects the monitored target is down. Once the incident is created, you will have available all incident management functionality required to track and manage its resolution.

Figure 3-1 Incident with a Single Event


The figure shows how both the incident and event attributes are used to help you manage the incident. From the figure, we see that the database DB1 has gone down and an event of Fatal severity has been raised. An incident is opened and the owner/administrator Scott is currently working to resolve the issue. The incident severity is currently Fatal because the incident inherits the worst severity of all the events within the incident. In this case there is only one event associated with the incident, so the severity is Fatal.

As an open incident, you can use Incident Manager to track its ownership, its resolution status, set the priority and, if necessary, add annotations to the incident to share information with others when working in a collaborative environment. In addition, you have direct access to pertinent information from MOS and links to other areas of Enterprise Manager that will help you resolve the database problem. By drilling down on an open incident, you can access this information and modify it accordingly, as shown in the following graphic.

incident manager target down

Incident Composed of Multiple Events

Situations of interest may involve more than a single event. It is an incident's ability to contain multiple events that allows you to monitor and manage complex and more meaningful issues. For example, if a monitored system is running out of space, separate multiple events such as tablespace full and filesystem full may be raised. Both, however, are related to running out of space. Another machine resource monitoring example might be the simultaneous raising of CPU utilization, memory utilization, and swap utilization events. Together, these events form an incident indicating extreme load is being placed on a monitored host. The following figure illustrates this example.

Figure 3-2 Incident with Multiple Events


The incident severity is Critical even though one of the events (Memory Utilization) is only at a Warning severity level. Incidents inherit the worst severity of all the events within the incident. The incident summary indicates why this incident should be of interest, in this case, "Machine Load is high". This message is an intuitive indicator for all administrators looking at this incident. By default, the incident summary is pulled from the message of the event. However, this message can be changed by any administrator working on the incident.

Because administrators are interested in overall machine load, administrator Sam has created an incident for these two metric events because they are related—together these events represent a host overload situation. An administrator needs to take action because memory is filling up and consumed CPU resource is too high. In its current state, this condition will impact any applications running on the host.
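
The severity inheritance described in this example can be sketched in a few lines of Python. This is only an illustration, not Enterprise Manager code: the `Severity` enum and the `incident_severity` function are hypothetical names, and the relative order of Advisory and Informational is assumed from the order of the severity table.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Higher value = worse severity (order assumed from the severity table).
    INFORMATIONAL = 1
    ADVISORY = 2
    WARNING = 3
    CRITICAL = 4
    FATAL = 5


def incident_severity(event_severities):
    """Return the worst severity among the events in an incident."""
    severities = list(event_severities)
    if not severities:
        raise ValueError("an incident must contain at least one event")
    return max(severities)


# Hypothetical events for the host overload incident.
events = {
    "CPU Utilization": Severity.CRITICAL,
    "Memory Utilization": Severity.WARNING,
}
print(incident_severity(events.values()).name)
```

With one Critical event and one Warning event, the incident as a whole is reported as Critical, which matches the figure.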

Helpdesk Incident Resolution

If your IT process requires that a helpdesk ticket be created to resolve incidents, you can use the helpdesk connector to integrate the incident with a helpdesk ticket. Enterprise Manager then automatically opens a ticket when the incident is created and tracks the ticket ID and ticket status. This provides administrators with a way to check the status of the ticket from within Incident Manager. Enterprise Manager also allows you to link out to a Web-based third-party console directly from the ticket so that you can launch the console in context.

Incident Attributes

Every incident possesses attributes that provide information such as identification, tracking status, and ownership. The following table lists available incident attributes.

Incident Attribute Definition
Escalated Escalation Levels
  • None (Not escalated)

  • Level 1

  • Level 2

  • Level 3

  • Level 4

  • Level 5

Priority Priority Options
  • None

  • Low

  • Medium

  • High

  • Very High

  • Urgent

Status Incident Status
  • New

  • Work in Progress

  • Resolved

Comment Annotations added by an administrator to communicate analysis information or actions taken to resolve the incident.
Owner Administrator/user currently working on the incident.
Acknowledged Yes or No. Acknowledging an incident stops any repeat notifications for that incident. When an incident is acknowledged, it is implicitly assigned to the user who acknowledged it. When a user assigns an incident to himself, it is considered 'acknowledged'. Once acknowledged, an incident cannot be unacknowledged.
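
The relationship between the Owner and Acknowledged attributes can be modeled roughly as follows. This is a minimal sketch under stated assumptions: the `Incident` class and its methods are hypothetical and do not correspond to any Enterprise Manager API. The model deliberately has no way to unacknowledge an incident.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    summary: str
    owner: Optional[str] = None
    acknowledged: bool = False

    def acknowledge(self, user: str) -> None:
        # Acknowledging implicitly assigns the incident to that user.
        self.acknowledged = True
        self.owner = user

    def assign(self, assignee: str, assigned_by: str) -> None:
        self.owner = assignee
        # Self-assignment counts as acknowledgement; assignment by
        # another administrator leaves the flag unchanged.
        if assignee == assigned_by:
            self.acknowledged = True


incident = Incident("Machine Load is high")
incident.assign("sam", assigned_by="scott")
print(incident.owner, incident.acknowledged)
incident.assign("sam", assigned_by="sam")
print(incident.owner, incident.acknowledged)
```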

Event Prioritization

When working in a large enterprise it is conceivable that when systems are under heavy load, an extraordinarily large number of incidents and events will be generated. All of these need to be processed in a timely and efficient manner in accordance with your business priorities. To have them processed sequentially can result in long waits before incidents can be resolved: High priority events/incidents need to be addressed before those of low priority.

In order to determine which events/incidents are high priority, Enterprise Manager uses a prioritization protocol based on two incident/event attributes: Lifecycle Status of the target and the Incident/Event Type. Lifecycle Status is a target property that specifies a target's operational status. You can set/view a target's Lifecycle Status from the UI (from a target's Target Setup menu, select Properties). You can set target Lifecycle Status properties across multiple targets simultaneously by using the Enterprise Manager Command Line Interface (EM CLI) set_target_property_value verb.

A target's Lifecycle Status is set when it is added to Enterprise Manager for monitoring. At that time, you determine where in the prioritization hierarchy that target belongs—the highest level being "mission critical" and the lowest being "development."

Target Lifecycle Status

  • Mission Critical (highest priority)

  • Production

  • Stage

  • Test

  • Development (lowest priority)

Incident/Event Type

  • Availability (highest priority)

  • All events/incidents (Fatal severity)

  • All events/incidents (Warning and Critical severities)

  • All events/incidents (Informational) (lowest priority)
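
One way to picture the two rankings above is as a sort key. The sketch below is an assumption-laden illustration, not the product's actual algorithm: the text does not state how the two attributes are combined, so the code assumes Lifecycle Status is compared first and the type tier second. All names (`LIFECYCLE_RANK`, `type_rank`, `processing_key`) and the sample queue are hypothetical.

```python
# Lower rank = processed earlier. Values follow the two lists above.
LIFECYCLE_RANK = {
    "Mission Critical": 0,
    "Production": 1,
    "Stage": 2,
    "Test": 3,
    "Development": 4,
}


def type_rank(event_type, severity):
    """Map an event/incident to its tier in the Incident/Event Type list."""
    if event_type == "Availability":
        return 0
    if severity == "Fatal":
        return 1
    if severity in ("Critical", "Warning"):
        return 2
    # Informational; severities not in the list also land here (assumption).
    return 3


def processing_key(item):
    # Assumption: Lifecycle Status is compared first, then the type tier.
    return (LIFECYCLE_RANK[item["lifecycle"]],
            type_rank(item["type"], item["severity"]))


queue = [
    {"name": "dev host CPU", "lifecycle": "Development",
     "type": "Metric Alert", "severity": "Critical"},
    {"name": "prod tablespace", "lifecycle": "Production",
     "type": "Metric Alert", "severity": "Warning"},
    {"name": "prod db down", "lifecycle": "Production",
     "type": "Availability", "severity": "Fatal"},
    {"name": "mc job done", "lifecycle": "Mission Critical",
     "type": "Job Status Change", "severity": "Informational"},
]

ordered = [item["name"] for item in sorted(queue, key=processing_key)]
print(ordered)
```

Under these assumptions, anything on a mission critical target is handled before production items, and within the production targets the availability event comes before the tablespace warning.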

Incident Manager

Incident Manager provides, in one location, the ability to search, view, manage, and resolve incidents and problems impacting your environment. Use Incident Manager to perform the following tasks:

Figure 3-3 Incident Manager


The advantages of using Incident Manager include:

Before Working with Incidents

In order to work with incidents, ensure that all relevant Enterprise Manager administrator accounts have been granted the appropriate privileges to manage incidents and that the notification system is properly configured to send notifications.

Granting User Privileges for Events, Incidents and Problems

Users are granted privileges for events, incidents, and problems in the following situations:

For events, two privileges are defined:

For incidents, two privileges are defined:

For problems, two privileges are defined:

To administer privileges from the Enterprise Manager UI, from the Setup menu, select Security and then select Administrators.

Working with Incidents

Data centers follow operational practices that enable them to manage events and incidents by business priority and in a collaborative manner. Enterprise Manager provides the following features to enable this management and automation:

You can manage an incident by performing the following:

  1. In the All Open Incidents view, select the incident.

  2. In the resulting Details page, click the General tab, then click Manage. The Manage dialog displays.

    Incident Manage dialog

    You can then adjust the priority, escalate the incident, and assign it to a specific engineer.

Setting Up Views in Incident Manager

A view is a set of search criteria for filtering incidents and problems in the system. You can define views to help you gain quick access to the incidents and problems on which you need to focus. For example, you may define a view to display all the incidents associated with the production databases that you own.

By specifying view preferences for the following incident attributes, you can filter out extraneous incidents: incident severity, incident message, acknowledgement flag, date the incident triggered, administrator assigned to it, resolution status, priority, escalated flag, ID, and category. Once the view preference is saved, Enterprise Manager displays only the list of matching incidents.

You can then search the incidents for only the ones with specific attributes, such as priority 1. The view allows easy access to pertinent incidents for daily triaging activity. Accordingly, you can save the search criteria as a filter named "All priority 1 incidents for my targets". The filter becomes available in the UI for immediate use and is available any time you log in to access the specific incidents.

Note:

The view you create is specific to your Enterprise Manager account and cannot currently be shared with other administrators.
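
Conceptually, a view behaves like a named, saved filter. The following sketch illustrates that idea only; the dictionary layout, field names, target names, and the `matches` and `apply_view` helpers are hypothetical and are not part of Enterprise Manager.

```python
def matches(incident, criteria):
    """Return True if the incident satisfies every criterion of the view."""
    return all(incident.get(field) in allowed
               for field, allowed in criteria.items())


def apply_view(incidents, view):
    """Return only the incidents that match the view's search criteria."""
    return [inc for inc in incidents if matches(inc, view["criteria"])]


# A saved view: a name plus the search criteria it stands for.
my_view = {
    "name": "All urgent incidents for my targets",
    "criteria": {
        "priority": {"Urgent"},
        "target": {"proddb1", "proddb2"},
    },
}

incidents = [
    {"id": 1, "priority": "Urgent", "target": "proddb1"},
    {"id": 2, "priority": "Low", "target": "proddb1"},
    {"id": 3, "priority": "Urgent", "target": "testdb"},
]

print([inc["id"] for inc in apply_view(incidents, my_view)])
```

Only incident 1 satisfies both criteria, so it is the only one the view would list.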

Perform the following steps:

  1. Navigate to the Incident Manager page.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. In the Views region located on the left, click Search.

    1. In the Search region, select Incidents from the Type list.

    2. In the Criteria region, choose all the criteria that are appropriate. To add fields to the criteria, click Add Fields... and select the appropriate fields.

    3. After you have provided the appropriate criteria, click Get Results.

    4. To view all the columns associated with this table, in the View menu, select Columns, then select Show All.

      Validate that the list of incidents matches what you are looking for. If not, change the search criteria as needed.

    5. Click the Create View... button.

Responding to and Working on a Simple Incident

Before you begin working on resolving an incident, ensure your Enterprise Manager account has been granted the appropriate privileges to manage incidents from your managed system.

  • Privileges on events are calculated based on the privilege on the underlying source objects. For example, the user will have VIEW privilege on an event if he can view the target for the event.

  • Privileges on an incident are calculated based on the privileges on participating events.

  • Similarly, problem privileges are calculated based on privileges on underlying incidents.

Perform the following steps:

  1. Navigate to Incident Manager.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. Use a view to filter the list of incidents. For example, use the My Open Incidents and Problems view to see incidents and problems assigned to you. You can then sort the list by priority.

    To view incidents assigned to you, click on the predefined view My Open Incidents and Problems.

    Work on the incident with the highest priority. Be aware that as you are working on an individual incident, new incidents might be coming in. Update the list of incidents by clicking the Refresh icon.

  3. To work on an incident, select the incident. In the General tab, click Acknowledge to acknowledge the incident and set yourself as owner.

  4. If the solution for the incident is unknown, use one or all of the following methods made available in the Incident page:

    • Use the Guided Resolution region and access any recommendations, diagnostic and resolution links available.

    • Check My Oracle Support Knowledge base for known solutions for the incident.

    • Study related incidents available through the Related Events and Incidents tab.

  5. Once the solution is known and can be resolved right away, resolve the incident by using tools provided by the system, if possible.

  6. In most cases, once the underlying cause has been fixed, the incident is cleared in the next evaluation cycle. However, in cases such as log-based incidents, clear the incident manually.

Searching My Oracle Support Knowledge

To access My Oracle Support Knowledge base entries from within Incident Manager, perform the following steps:

  1. Navigate to Incident Manager.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. Select one of the standard views. Choose the appropriate incident or problem in the View table.

  3. In the resulting details region, click My Oracle Support Knowledge. Sign in to My Oracle Support.

  4. On the My Oracle Support page, click the Knowledge tab to browse the knowledge base.

    From this page, in addition to accessing formal Oracle documentation, you can also change the search string to look for additional knowledge base entries.

Open Service Request

There are times when you may need assistance from Oracle Support to resolve a problem. To submit a service request (SR), perform the following steps:

  1. Navigate to Incident Manager.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. Use one of the standard views or one of your custom views, or search, to find the problem. Select the appropriate problem from the table.

  3. Click on the Support Workbench: Package Diagnostic link.

  4. Complete the workflow for opening an SR. Upon completing the workflow, a draft SR will have been created.

  5. Sign in to My Oracle Support if you are not already signed in.

  6. On the My Oracle Support page, click the Service Requests tab.

  7. Click the Create SR button. Click Help to learn how to create a new SR.

Suppressing Incidents and Problems

There are times when it is convenient to hide an incident or problem from the list in the All Open Incidents page or the All Open Problems page. For example, you may want to suppress an incident while the incident is being actively worked on and you do not need to be notified.

To suppress an incident or problem:

  1. Navigate to Incident Manager.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. Select either the All Open Incidents view or the All Open Problems view. Choose the appropriate incident or problem. Click the General tab.

  3. In the resulting details region, click More, then select Suppress.

  4. On the resulting Suppress pop-up, choose the appropriate suppression type. Add a comment if desired.

  5. Click OK.

Incidents - Advanced Tasks

You can perform the following advanced tasks using Incident Manager:

Creating an Incident Manually

If an event of interest occurs that is not covered by any rule and you want to convert that event to an incident, perform the following:

  1. Using an available view, find the event of interest.

  2. Select the event in the table.

  3. From the More... drop-down menu, choose Create Incident...

  4. Enter the incident details and click OK.

  5. Should you decide to work on the incident, set yourself as owner of the incident and update status to Work in Progress.

Example Scenario

As per the operations policy, the DBA manager has set up rules to create incidents for all critical issues for his databases. The remainder of the issues are triaged at the event level by one of the DBAs.

One of the DBAs receives an e-mail for an "SQL Response" event (not associated with an incident) on the production database. He accesses the details of the event by clicking on the link in the e-mail. He reviews the details of the event. This is an issue that needs to be tracked and resolved, so he opens an incident to track the resolution of the issue. He marks the status of the incident as "Work in Progress".

Managing Workload Distribution of Incidents

Incident Manager enables you to manage incidents and problems to be addressed by your team.

Perform the following tasks:

  1. Navigate to Incident Manager.

    From the Enterprise menu on the Enterprise Manager home page, select Monitoring, then select Incident Manager.

  2. Use the standard or custom views to identify the incidents for which your team is responsible. You may want to focus on unassigned and unacknowledged incidents and problems.

  3. Review the list of incidents. This includes: determining the person assigned to the incident, checking its status, progress made, and actions taken by the incident owner.

  4. Add comments, change priority, reassign the incident as needed by clicking on the Manage button in the Incident Details region.

Example Scenario

The DBA manager uses Incident Manager to view all the incidents owned by his team. He ensures all of them are correctly assigned; if not, he reassigns and prioritizes them appropriately. He monitors the escalated events for their status and progress, adds comments as needed for the owner of the incident. In the console, he can view how long each of the incidents has been open. He also reviews the list of unassigned incidents and assigns them appropriately.

Rule Sets

With previous versions of Enterprise Manager, you used notification rules to choose the individual targets and conditions for which you want to perform actions and/or receive notifications (send e-mail, page, open a trouble ticket) from Enterprise Manager. For Enterprise Manager 12c, the concept and function of notification rules has been replaced with rules and rule sets.

A rule set is a set of one or more rules that apply to a common set of objects such as targets (hosts, databases, groups), jobs, metric extensions, or self updates. A rule set allows you to logically combine different rules relating to the common set of objects (such as jobs, targets, applications) into a single manageable unit. Operationally, individual rules within a rule set are executed in a specified order, as are the rule sets themselves. By default, the execution order for both rules and rule sets is the order in which they are created, but they can be reordered from the Incident Manager UI.

The following figure shows typical rule set structure and how the individual rules are applied to a heterogeneous group of targets.

Figure 3-4 Rule Set Application


The graphic illustrates a situation where all rules pertaining to a group of targets can be put into a single rule set. In the above example, a group named PROD-GROUP consisting of hosts, databases, and WebLogic servers exists as part of a company's managed environment. A single rule set is created to manage the group.

In addition to the actual rules contained within a rule set, a rule set possesses the following attributes:

Out-of-Box Rule Sets

Enterprise Manager provides out-of-box rule sets for incident creation and event deletion based on typical scenarios. The following rule sets are immediately available upon installation.

Incident Management Rule Sets for All Targets

  • Incident creation Rule for target down.

  • Incident creation Rule for agent unreachable (for Agents and hosts).

  • Incident creation Rule for metric alerts (for critical severity only).

  • Out-of-box Incident creation rule for Service Level Agreement Alerts.

  • Incident creation rule for compliance score violation

  • Incident creation rule for high-availability events.

  • Auto-clear Rule for metric alerts older than 7 days.

  • Auto clear Rule for job status change terminal status events older than 7 days.

  • Clear Application Dependency and Performance (ADP) alerts without incidents after 7 days.

Event Management Rule Set for Self-Update

  • Notification Rule for new updates

Note:

Out-of-Box rule sets cannot be deleted. They can only be enabled or disabled.

Some examples of the types of actions that a rule can perform are:

  • Create an incident based on an event.

  • Perform notification actions such as sending an e-mail or generating a helpdesk ticket.

  • Perform actions to manage incident workflow, with notification via e-mail, PL/SQL methods, or SNMP traps. For example, if a target down event occurs, create an incident and e-mail administrator Joe about the incident. If the incident is still open after two days, set the escalation level to one and e-mail Joe's manager.

Rule Set Types

There are two types of Rule Sets:

  • Enterprise: Used to implement all operational practices within your IT organization. All supported actions are available for this type of rule set. However, because this type of rule set can perform all actions, there are restrictions as to who can create an enterprise rule set.

    In order to create or edit an enterprise rule set, an administrator must have been granted the "Create Enterprise Rule Set" privilege on the "Enterprise Rule Set" resource. An enterprise rule set can have multiple authors. However, if the originator of the rule set wants other administrators to edit the rule set, he will need to share access in order to work collaboratively. Rule sets are visible to all administrators.

    Important: When a rule set performs actions, the privileges of the rule set creator are used.

  • Private: If an administrator does not have the Create Enterprise Rule Set resource privilege and consequently cannot create an enterprise rule set, but wants to be notified about something he is monitoring, he can create a private rule set. The only action a private rule set can perform is to send e-mail to the rule set owner. Any administrator can create a private rule set.

Rules

Rules are instructions within a rule set that automate actions on incoming events or incidents or problems (critical errors in Oracle software). Because rules operate on incoming incidents/events/problems, if you create a new rule, it will not act retroactively on incidents/events/problems that have already occurred.

Every rule is composed of two parts:

  • Criteria: The events/incidents/problems to which the rule applies.

  • Action(s): The ordered set of one or more operations on the specified events, incidents, or problems. Each action can be executed based on additional conditions.

    Important: Rules are executed in a specified order. The rule execution order can be changed at any time. By default, rules are executed in the order they are created.

The following table shows how rule criteria and actions determine rule application. In this rule operation example there are three rules executed in order according to specified criteria.

Table 3-1 Rule Operation

Rule Name | Execution Order | Criteria | Action: Condition | Action: Actions
Rule 1 | First | CPU Util(%), Tablespace Used(%) metric alert events of warning or critical severity | (none) | Create incident
Rule 2 | Second | Incidents of warning or critical severity | If severity = critical | Notify by page
 |  |  | If severity = warning | Notify by e-mail
Rule 3 | Third | Incidents open for more than 7 days | (none) | Set escalation level to 1


In the rule operation example, Rule 1 applies to two metric alert events: CPU Utilization and Tablespace Used. Whenever these events reach either Warning or Critical severity threshold levels, an incident is created.

When the incident severity level (the incident severity is inherited from the worst event severity) reaches Warning, Rule 2's warning condition applies and Enterprise Manager sends an e-mail to the administrator. If the incident severity level reaches Critical, Rule 2's critical condition applies and Enterprise Manager sends a page to the administrator.

If the incident remains open for more than seven days, Rule 3 applies and the incident escalation level is increased from None to Level 1.
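
The rule operation example above can be traced with a small simulation. This is a sketch for illustration only: the functions `rule1`, `rule2`, `rule3`, and `process`, the dictionary fields, and the `days_open` parameter (used here to simulate elapsed time) are all hypothetical and do not reflect how Enterprise Manager implements rules.

```python
WATCHED_METRICS = {"CPU Util(%)", "Tablespace Used(%)"}


def rule1(event):
    """Rule 1: create an incident for matching metric alert events."""
    if (event["type"] == "Metric Alert"
            and event["metric"] in WATCHED_METRICS
            and event["severity"] in {"Warning", "Critical"}):
        # Escalation level 0 stands for "None" (not escalated).
        return {"severity": event["severity"], "days_open": 0,
                "escalation": 0, "notifications": []}
    return None


def rule2(incident):
    """Rule 2: page on Critical, e-mail on Warning."""
    if incident["severity"] == "Critical":
        incident["notifications"].append("page")
    elif incident["severity"] == "Warning":
        incident["notifications"].append("e-mail")


def rule3(incident):
    """Rule 3: escalate incidents that have been open more than 7 days."""
    if incident["days_open"] > 7 and incident["escalation"] == 0:
        incident["escalation"] = 1


def process(event, days_open=0):
    """Apply the three rules in their execution order."""
    incident = rule1(event)
    if incident is None:
        return None
    incident["days_open"] = days_open
    rule2(incident)
    rule3(incident)
    return incident


event = {"type": "Metric Alert", "metric": "CPU Util(%)", "severity": "Critical"}
print(process(event, days_open=8))
```

For a Critical CPU utilization event that stays open for eight days, the simulated incident ends up with a page notification and escalation level 1.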

Rule Criteria

Rules are applied to events, incidents, and problems according to criteria selected at the time of rule creation (or update). There are three rule applications:

  • Incoming/updated events

  • Newly created/updated incidents

  • Newly created/updated problems

Available criteria vary depending on the rule application. The following tables list selectable criteria for each type.

Table 3-2 Rule Criteria: Events

Criteria Description

Type

Rule applies to a specific event type. The following event types are available:

  • Application Dependency and Performance Alert

  • Compliance Standard Rule Violation

  • Compliance Standard Score Violation

  • High Availability

  • JVM Diagnostics Threshold Violation

  • Job Status Change

  • Metric Alert

  • Metric Evaluation Error

  • Service Level Agreement Alert

  • Target Availability

  • User-Reported

Severity

Rule applies to a specific event severity. The following event severities are available:

  • Fatal

  • Critical

  • Warning

  • Advisory

  • Informational

  • Clear

Category

Rule applies to a specific event category. The following event categories are available:

  • Availability

  • Business

  • Capacity

  • Configuration

  • Diagnostics

  • Error

  • Faults

  • Jobs

  • Load

  • Performance

  • Security

Target type

Rule applies to a specific target type. The following target types are available:

  • Agent

  • Application Deployment

  • Beacon

  • CSA Collector

  • Database Instance

  • Database System

  • EM Service

  • Host

  • Infrastructure Cloud

  • Metadata Repository

  • OMS Console

  • OMS Repository

  • OMS Platform

  • OMS and Repository

  • Oracle Authorization Policy Manager

  • Oracle Fusion Middleware Farm

  • Oracle HTTP Server

  • Oracle Home

  • Oracle Management Service

  • Oracle WebLogic Domain

  • Oracle WebLogic Server

Associated with incident

Typically, events are associated with incidents through rules. Specify Yes or No.

Event name

Rule applies to events with a specific name. The specified name can either be an exact match or a pattern match.

Root cause analysis result

Upon completion of Root Cause Analysis (RCA) for an event, the rule applies to the event if it is marked as either root cause or symptom. Alternatively, the rule can act on an RCA event when it is no longer a symptom.

Associated incident acknowledged

Rule applies to an event that is associated with a specific incident when that incident is acknowledged by an administrator. Specify Yes or No.

Total occurrence count

For duplicated events, the rule applies when the total number of event occurrences reaches a specified number.

Comment added

Rule applies to events where an administrator adds a comment.


For incidents, a rule can apply to all new and/or updated incidents, or newly created incidents that match specific criteria shown in the following table.

Table 3-3 Rule Criteria: Incidents

Criteria Description

Rules that created the incident

Rule applies to incidents raised by a specific rule.

Category

Rule applies to a specific incident category. The following incident categories are available:

  • Availability

  • Business

  • Capacity

  • Configuration

  • Diagnostics

  • Error

  • Faults

  • Jobs

  • Load

  • Performance

  • Security

Target Type

Rule applies to a specific target type. The following target types are available:

  • Agent

  • Application Deployment

  • Beacon

  • CSA Collector

  • Database Instance

  • Database System

  • EM Service

  • Host

  • Infrastructure Cloud

  • Metadata Repository

  • OMS Console

  • OMS Repository

  • OMS Platform

  • OMS and Repository

  • Oracle Authorization Policy Manager

  • Oracle Fusion Middleware Farm

  • Oracle HTTP Server

  • Oracle Home

  • Oracle Management Service

  • Oracle WebLogic Domain

  • Oracle WebLogic Server

Severity

Rule applies to a specific incident severity. The following incident severities are available:

  • Fatal

  • Critical

  • Warning

  • Advisory

  • Informational

  • Clear

Acknowledged

Rule applies if the incident has been acknowledged by an administrator. Specify Yes or No.

Owner

Rule applies for a specified incident owner.

Priority

Rule applies when incident priority matches a selected priority. Available priorities are:

  • Urgent

  • Very High

  • High

  • Medium

  • Low

  • None

Status

Rule applies when the incident status matches a selected incident status. Available statuses:

  • New

  • Work in Progress

  • Resolved

  • Closed

Escalation Level

Rule applies when the incident escalation level matches the selected level. Available escalation levels: None, Level 1, Level 2, Level 3, Level 4, Level 5

Associated with Ticket

Rule applies when the incident is associated with a helpdesk ticket. Specify Yes or No.

Associated with Service Request

Rule applies when the incident is associated with a service request. Specify Yes or No.

Diagnostic Incident

Rule applies when the incident is a diagnostic incident. Specify Yes or No.

Unassigned

Rule applies if the newly raised incident does not have an owner.

Comment Added

Rule applies if an administrator adds a comment to the incident.


For problems, a rule can apply to all new and/or updated problems, or newly created problems that match specific criteria shown in the following table.

Table 3-4 Rule Criteria: Problems

Criteria Description

Problem key

Each problem has a problem key, which is a text string that describes the problem. It includes an error code (such as ORA 600) and in some cases, one or more error parameters.

Rule can apply to a specific problem key or a key matching a specific pattern (using a wildcard character).

Category

Rule applies to a specific problem category. The following problem categories are available:

  • Availability

  • Business

  • Capacity

  • Configuration

  • Diagnostics

  • Error

  • Faults

  • Jobs

  • Load

  • Performance

  • Security

Target Type

Rule applies to a specific target type. The following target types are available:

  • Agent

  • Application Deployment

  • Beacon

  • CSA Collector

  • Database Instance

  • Database System

  • EM Service

  • Host

  • Infrastructure Cloud

  • Metadata Repository

  • OMS Console

  • OMS Platform

  • OMS and Repository

  • Oracle Authorization Policy Manager

  • Oracle Fusion Middleware Farm

  • Oracle HTTP Server

  • Oracle Home

  • Oracle Management Service

  • Oracle WebLogic Domain

  • Oracle WebLogic Server

Acknowledged

Rule applies when the problem is acknowledged.

Owner

Rule applies for a specified problem owner.

Priority

Rule applies when problem priority matches a selected priority. Available priorities are:

  • Urgent

  • Very High

  • High

  • Medium

  • Low

  • None

Status

Rule applies when the problem matches a specific status. The following statuses are available:

  • New

  • Work in Progress

  • Resolved

  • Closed

Escalation Level

Rule applies when the problem escalation level matches the selected level. Available escalation levels: None, Level 1, Level 2, Level 3, Level 4, Level 5

Incident Count

Rule applies when the number of incidents related to the problem reaches the specified count limit. The problem owner and the Operations manager are notified via e-mail.

Associated with Service Request

Rule applies if the incoming problem has an associated Service Request. Specify Yes or No.

Associated with Bug

Rule applies if the incoming problem has an associated bug. Specify Yes or No.

Unassigned

Rule applies if the newly raised problem does not have an owner.

Comment Added

Rule applies if an administrator adds a comment to the problem.
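The problem key criterion above accepts either a specific key or a wildcard pattern. As a rough sketch of that kind of matching, the following Python uses shell-style wildcards from the standard library. It assumes `*` is the wildcard character, which may differ from what Enterprise Manager accepts, and the function name and sample keys are hypothetical.

```python
from fnmatch import fnmatchcase

def problem_key_matches(problem_key: str, pattern: str) -> bool:
    """Return True if the key equals the pattern or matches its '*' wildcards.

    Assumption: '*' is the wildcard; the console may use a different character.
    """
    return fnmatchcase(problem_key, pattern)

# Hypothetical problem keys, shown only to illustrate the two matching modes.
exact = problem_key_matches("ORA 600", "ORA 600")            # specific key
by_pattern = problem_key_matches("ORA 600 [arg1]", "ORA 600*")  # pattern match
```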


Rule Actions

For each rule condition, Enterprise Manager allows you to define specific actions. The following table summarizes available actions for each rule application.

Table 3-5 Available Rule Actions

| Action | Event | Incident | Problem |
|---|---|---|---|
| E-mail | Yes | Yes | Yes |
| Page | Yes | Yes | Yes |
| Advanced Notification Method | Yes | Yes | Yes |
| Create an Incident | Yes | No | No |
| Update Incident/Problem Attributes | No | Yes | Yes |
| Create a Helpdesk Ticket | Yes | Yes | Yes |
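The availability matrix in Table 3-5 can also be expressed as a lookup, which is convenient when checking a planned rule design against it. The sketch below simply transcribes the table; the dictionary and function names are hypothetical and do not correspond to any Enterprise Manager interface.

```python
from typing import Dict, List

# Transcription of Table 3-5: action -> availability per rule application.
AVAILABLE_ACTIONS: Dict[str, Dict[str, bool]] = {
    "E-mail": {"event": True, "incident": True, "problem": True},
    "Page": {"event": True, "incident": True, "problem": True},
    "Advanced Notification Method": {"event": True, "incident": True, "problem": True},
    "Create an Incident": {"event": True, "incident": False, "problem": False},
    "Update Incident/Problem Attributes": {"event": False, "incident": True, "problem": True},
    "Create a Helpdesk Ticket": {"event": True, "incident": True, "problem": True},
}

def actions_for(rule_application: str) -> List[str]:
    """List the actions available to a rule on 'event', 'incident', or 'problem'."""
    return [
        action
        for action, support in AVAILABLE_ACTIONS.items()
        if support[rule_application]
    ]
```

Calling `actions_for("event")` includes "Create an Incident" but not "Update Incident/Problem Attributes", which is the main asymmetry in the table.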


Rule Set Guidelines

When creating rule sets, adhering to the following guidelines will result in efficient use of system resources as well as operational efficiency.

  • For rule sets that operate on targets (for example, hosts and databases), use groups to consolidate all targets into a single target for the rule set.

  • Consolidate all rules that apply to the group members within the same rule set and make the group the target of the rule set.

  • Leverage the execution order of rules within the rule set.

When creating a new rule, you choose which object the rule applies to: events, incidents, or problems. Use the following rule usage guidelines to help guide your selection.

Table 3-6 Rule Usage Guidelines

Rule Usage Application

Rules on Events

To create incidents for the alerts/events managed in Enterprise Manager

To create tickets for incidents managed by helpdesk analysts: first create an incident for the event, then create a ticket for the incident

To send events to third-party management systems

To send notifications on events (no incident created)

Rules on Incidents

To automate management of incident workflow operations (assign owner, set priority, set escalation levels, and so on) and send notifications

To create tickets based on incident conditions. For example, create a ticket if the incident is escalated to level 2.

Rules on Problems

To automate management of problem workflow operations (assign owner, set priority, set escalation levels, and so on) and send notifications


Rule Set Example

The following example illustrates many of the implementation guidelines just discussed. All targets have been consolidated into a single group, all rules that apply to group members are part of the same rule set, and the execution order of the rules has been set. In this example, the rule set applies to a group (Production Group G) that consists of the following targets:

  • DB1 (database)

  • Host1 (host)

  • WLS1 (WebLogic Server)

All rules in the rule set perform three types of actions: incident creation, notification, and escalation.

graphic shows an example rule set containing 3 rules.

In a more detailed view of the rule set, we can see how the guidelines have been followed.

graphic shows a detailed view of the rule set where actual rules have been added.

In this detailed view, there are five rules that apply to all group members. The execution sequence of the rules (rule 1 - rule 5) has been leveraged to correspond to the three types of rule actions in the rule set:

  • Rules 1-3: Incident Creation

  • Rule 4: Notification

  • Rule 5: Escalation

Synchronizing rule execution order with the progression of rule action categories makes the rule set more efficient to run and to maintain. As this example shows, consolidating all notifications in one rule makes future notification changes easier, because the notification operations are defined in one place rather than in several. Note: This assumes that the notification requirements for all the incidents (from rules 1 - 3) are the same.

The following table illustrates explicit rule set operation for this example.

Table 3-7 Example Rule Set for Production Group G

Rule Set: Targets within Production Group G

| Rule Name | Execution Order | Criteria | Condition | Actions |
|---|---|---|---|---|
| Rule 1 | First | DB1 goes down. Host1 goes down. WLS1 goes down. | | Create incident. |
| Rule 2 | Second | DB1 Tablespace Full (%): Warning=85, Critical=97. Host1 CPU Utilization (%): Warning=65, Critical=85. WLS1 Heap Usage (%): Warning=80, Critical=90. | If severity=Warning; if severity=Critical | Create incident. |
| Rule 3 | Third | Event generated for problem job status changes for DB1, Host1, and WLS1. | | Create incident. |
| Rule 4 | Fourth | All incidents for Production Group G | Severity=Warning | Send e-mail |
| | | | Severity=Critical | Send page |
| Rule 5 | Fifth | Incident remains open for more than 12 days. | Severity=Fatal | Increase escalation level to 1. |
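Rule 2 in this example depends on the Warning and Critical thresholds set for each metric. As a minimal sketch of how a metric value maps to a severity, the following Python uses the thresholds from Table 3-7. It assumes a "greater than or equal to" comparison, which is an assumption (the comparison operator is part of the metric's threshold settings), and the names `THRESHOLDS` and `severity_for` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# (target, metric) -> (warning threshold, critical threshold), from Table 3-7.
THRESHOLDS: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("DB1", "Tablespace Full (%)"): (85, 97),
    ("Host1", "CPU Utilization (%)"): (65, 85),
    ("WLS1", "Heap Usage (%)"): (80, 90),
}

def severity_for(target: str, metric: str, value: float) -> Optional[str]:
    """Return 'Critical', 'Warning', or None for a collected metric value.

    Assumption: a value at or above a threshold violates it.
    """
    warning, critical = THRESHOLDS[(target, metric)]
    if value >= critical:
        return "Critical"
    if value >= warning:
        return "Warning"
    return None
```

Under these assumptions, a tablespace at 90% full on DB1 yields a Warning event, which Rule 2 turns into an incident and Rule 4 then reports by e-mail.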


Before Using Rules

Before you use rules, ensure the following prerequisites have been set up:

Privileges Required for Enterprise Rule Sets

As the owner of the rule set, an administrator can perform the following:

If an incident or problem rule has an update action (for example, change priority), it will take the action only if the owner of the respective rule set has manage privilege on the matching incident or problem.

To acquire privileges, click Setup on the Enterprise Manager home page, select Security, then select Administrators to access the Administrators page. Select an administrator from the list, then click Edit to access the Administrator properties wizard as shown in the following graphic.

graphic shows the administrator edit wizard.

Working with Rules

You can perform the following tasks using Rules:

Creating a Rule

To create a rule, perform the following steps:

  1. From the Setup menu, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, edit the existing rule set (highlight the rule set and click Edit...) or create a new rule set. Rules are created in the context of a rule set.

  3. In the Rules tab of the Edit Rule Set page, click Create... and select the type of rule to create (Event, Incident, Problem) on the Select Type of Rule to Create page. Click Continue.

  4. In the Create New Rule wizard, provide the required information.

  5. Once you have finished defining the rule, click Continue to add the rule to the rule set. Click Save to save the changes made to the rule set.

Creating a Rule to Create an Incident

To create a rule that creates an incident, perform the following steps:

  1. From the Setup menu, select Incidents, then select Incident Rules.

  2. Determine whether there is an existing rule set that contains a rule that manages the event. In the Rules page, use the Search option to find the events for the target and the associated rule set.

    Note: In the case where there is no existing rule set, create a rule set by clicking Create Rule Set... You then create the rule as part of creating the rule set.

  3. Select the rule set that will contain the new rule. Click Edit... Then, in the Rules tab of the Edit Rule Set page:

    1. Click Create ...

    2. Select "Incoming events and updates to events"

    3. Click Continue.

    Provide the rule details using the Create New Rule wizard.

    1. Select the Event Type the rule will apply to, for example, Metric Alert. (Metric Alert is available for rule sets of the type Targets.) You can then specify metric alerts by selecting Specific Metrics. The table for selecting metric alerts displays. Click the +Add button to launch the metric selector. On the Select Specific Metric Alert page, select the target type, for example, Database Instance. A list of relevant metrics displays. Select the ones in which you are interested. Click OK.

      You also have the option to select the severity and corrective action status.

    2. Once you have provided the initial information, click Next. Click +Add to add the actions to occur when the event is triggered. One of the actions is to Create Incident.

      As part of creating an incident, you can assign the incident to a particular user, set the priority, and create a ticket. Once you have added all the conditional actions, click Continue.

    3. After you have provided all the information on the Add Actions page, click Next to specify the name and description for the rule. Once on the Review page, verify that all the information is correct. Click Back to make corrections; click Continue to return to the Edit (Create) Rule Set page.

    4. Click Save to ensure that the changes to the rule set and rules are saved to the database.

  4. Test the rule by generating a metric alert event on the metrics chosen in the previous steps.

Creating a Rule to Manage Escalation of Incidents

To create a rule to manage incident escalation, perform the following steps:

  1. From the Setup menu, select Incidents, then select Incident Rules.

  2. Determine whether there is an existing rule set that contains a rule that manages the incident. You can add it to any of your existing rule sets on incidents.

    Note: In the case where there is no existing rule set, create a rule set by clicking Create Rule Set... You then create the rule as part of creating the rule set.

  3. Select the rule set that will contain the new rule. Click Edit... in the Rules tab of the Edit Rule Set page, and then:

    1. Click Create ...

    2. Select "Newly created incidents or updates to incidents"

    3. Click Continue.

  4. For demonstration purposes, the escalation concerns a production database.

    As per the organization's policy, the DBA manager is notified for escalation level 1 incidents where a fatal incident is open for 48 hours. Similarly, the DBA director is paged if the incident has been escalated to level 2, the severity is fatal, and it has been open for 72 hours. The operations VP is paged for fatal incidents open for more than 72 hours when the incident has been escalated to level 3.

    Provide the rule details using the Create New Rule wizard.

    1. Select Specific Incidents where the rule applies to all newly created incidents or incidents where severity=fatal.

    2. In the Conditions for Actions region located on the Add Actions page, select Execute the actions on the conditions specified.

      Select How long the incident is open and in a particular state (select time and optional expressions)

      Select the Time to be 48 hours and the Attribute Name to be Escalation Level with a value of 1. Click Continue.

    3. In the Basic Notification region, type the name of the administrator to be notified by e-mail or page.

    4. Repeat substeps 2 and 3 to page the DBA director (escalation level=2, severity=fatal, and open for 72 hours) and the operations VP (escalation level=3, severity=fatal, and open for 72 hours).

      Specify a separate duration condition for each notification. Note that these steps define notifications only; the action that sets the escalation level is not included in this procedure.

    5. Review the summary and save the rule.

    6. Click Next until you get to the Summary screen. Verify that the information is correct and click Save.

  5. Review the sequence of existing enterprise rules and position the newly created rule in the sequence.

    On the Edit Rule Set page, select Actions, then select Reorder Rules. Click Save to save the change to the sequence.

Example Scenario

In many companies, the operations team handles incidents at different escalation levels. An incident is escalated to a higher level based on how long the incident remains unresolved.

To facilitate this process, the administration manager creates a rule to escalate unresolved incidents based on their age:

  • To level 1 if the incident is open for 30 minutes

  • To level 2 if the incident is open for 1 hour

  • To level 3 if the incident is open for 90 minutes

As per the organization's policy, the DBA manager is notified for escalation level 1. Similarly, the DBA director and operations VP are paged for incidents escalated to levels "2" and "3" respectively.

Accordingly, the administration manager enters the above logic and the respective Enterprise Manager administrator IDs in a separate rule to achieve the above notification requirement. Enterprise Manager administrator IDs represent the respective users with required target privileges and notification preferences (that is, e-mail addresses and schedule).
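The age-based policy in this scenario can be summarized as a small lookup. The Python below is a sketch of the policy logic only, not of how Enterprise Manager evaluates rules. It assumes that "open for N minutes" means at least N minutes, and the names `ESCALATION_POLICY` and `escalation_for` are hypothetical.

```python
from typing import List, Optional, Tuple

# (minimum minutes open, escalation level, person notified), highest level first.
ESCALATION_POLICY: List[Tuple[int, int, str]] = [
    (90, 3, "operations VP"),
    (60, 2, "DBA director"),
    (30, 1, "DBA manager"),
]

def escalation_for(minutes_open: int) -> Tuple[int, Optional[str]]:
    """Return (escalation level, recipient) for an unresolved incident.

    Level 0 with no recipient stands for escalation level "None".
    Assumption: an incident open for exactly the threshold is escalated.
    """
    for threshold, level, recipient in ESCALATION_POLICY:
        if minutes_open >= threshold:
            return level, recipient
    return 0, None
```

Checking the highest threshold first means an incident open for 90 minutes resolves directly to level 3 rather than stopping at level 1.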

Creating a Rule to Escalate a Problem

To create a rule to escalate a problem, perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, either create a new rule set (click Create Rule Set...) or edit an existing rule set (highlight the rule set and click Edit...). (Rules are created in the context of a rule set.)

  3. In the Rules section of the Edit Rule Set page, select Create... to create an enterprise rule to automate actions on the problem. Select Problem Rule on the Select Type of Rule to Create page. Click Continue.

  4. On the Create New Rule page, select Specific problems and add the following criteria:

    The Attribute Name is Incident Count, the Operator is Greater than or equals and the Values is 20. Click Next.

  5. In the Conditions for Actions region on the Add Actions page, select Always execute the action. Specify the following actions to take when the rule matches the condition:

    • In the Notifications region, send e-mail to the owner of the problem and to the Operations Manager.

    • In the Update Problem region, enter the e-mail address of the appropriate administrator in the Assign to field.

    Click Continue.

  6. Review the rules summary. Make corrections as needed. Click Save.

Example Scenario

In an organization, whenever an unresolved problem has more than 20 occurrences of associated incidents, the problem should be auto-assigned to the appropriate administrator based on target type of the target on which the problem has been raised.

Accordingly, a problem rule is created to observe the count of incidents attached to the problem and notify the appropriate administrator handling that specific target type.

The problem owner and the Operations manager are notified by way of e-mail.
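The assignment logic in this scenario can be sketched as follows. The rule steps above use "Greater than or equals" 20, so the sketch follows that operator. The administrator names and the mapping from target type to administrator are hypothetical placeholders, and treating Resolved and Closed as the resolved states is an assumption.

```python
from typing import Dict, Optional

INCIDENT_COUNT_THRESHOLD = 20  # matches "Greater than or equals 20" in the rule

# Hypothetical administrators responsible for each target type.
ASSIGNEE_BY_TARGET_TYPE: Dict[str, str] = {
    "Database Instance": "dba_admin",
    "Host": "host_admin",
}

def assignee_for(problem: Dict[str, object]) -> Optional[str]:
    """Return the administrator to assign, or None if the rule does not apply."""
    # Assumption: only unresolved problems are auto-assigned.
    if problem["status"] in ("Resolved", "Closed"):
        return None
    if int(problem["incident_count"]) < INCIDENT_COUNT_THRESHOLD:
        return None
    return ASSIGNEE_BY_TARGET_TYPE.get(str(problem["target_type"]))
```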

Setting Up Automated Notification for Private Rule

A DBA has set up a backup job on the database that he is administering. As part of the job, the DBA has subscribed to e-mail notification for "completed" job status. Before you create the rule, ensure that the DBA has the requisite privileges to create jobs.

Perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, either edit an existing rule set (highlight the rule set and click Edit...) or create a new rule set.

    Note: The rule set must be defined as a Private rule set.

  3. In the Rules tab of the Edit Rule Set page, select Create... and select Event Rule. Click Continue.

  4. On the Select Events page, select Job Status Change as the Event Type. Select the job in which you are interested either by selecting a specific job or selecting a job by providing a pattern, for example, Backup Management.

    Add additional criteria by adding an attribute: Target Type as Database Instance.

  5. Add conditional actions: Event matches the following criteria (Severity is Informational) and E-mail Me for notifications.

  6. Review the rules summary. Make corrections as needed. Click Save.

  7. Create a database backup job and subscribe for e-mail notification when the job completes.

When the job completes, Enterprise Manager publishes the informational event for the "Job Complete" state of the job. The newly created rule matches the event and an e-mail is sent to the DBA.

The DBA receives the e-mail and clicks the link to access the details section in Enterprise Manager console for the event.

Creating a Rule to Receive Notification Regarding Incidents

To create a rule to receive notification on incidents, perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu, select Incidents, then select Incident Rules.

  2. Edit an existing enterprise rule set.

    Highlight the rule set and click Edit...

  3. In the Rules section of the Edit Rule Set page:

    1. Click Create ...

    2. Select "Newly created incidents or updates to incidents"

    3. Click Continue.

    Select the type of incidents to which the rule should apply (All new incidents and updated incidents, All new incidents, or Specific incidents) and click Next.

  4. To be notified of the incident, define additional actions using the Add Actions page. For the conditions under which the incident occurs, select Always Execute the Actions. For the notifications, provide information in the Basic Notifications region.

  5. When you receive the e-mail regarding the incident, click on the link to access the details section in Enterprise Manager console for the incident.

Rules - Advanced Tasks

You can perform the following advanced tasks using Rules:

Setting Up a Rule to Send Different Notifications for Different Severity States of an Event

Before you perform this task, ensure the DBA has set appropriate thresholds for the metric so that a critical metric alert is generated as expected.

Consider the following example:

The Administration Manager sets up a rule to page the specific DBA when a critical metric alert event occurs for a database in a production database group and to e-mail the DBA when a warning metric alert event occurs for the same targets. This task occurs when a new group of databases is deployed and DBAs request to create appropriate rules to manage such databases.

Perform the following steps to set up the rules:

  1. From the Setup menu, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, highlight a rule set and click Edit.... (Rules are created in the context of a rule set. If there is no existing rule set to manage the newly added target, create a rule set.)

  3. In the Edit Rule Set page, locate the Rules section. From the Actions menu, select Add Event Rule.

  4. Provide the rule details as follows:

    1. For Type, select Metric Alert.

    2. In the criteria section, select Severity. From the drop-down list, check Critical and Warning as the selected values. Click Next.

    3. On the Add Actions page, click +Add.

      In the Create Incident section, check the Create Incident option. Click Continue. The Add Action page displays with the new rule. Click Next.

    4. Specify a name for the rule and a description. Click Next.

    5. On the Review page, ensure your settings are correct and click Continue. A message appears informing you that the rule has been successfully created. Click OK to dismiss the message.

      Next, you need to create a rule to perform the notification actions.

  5. From the Rules section on the Edit Rule Set page, click Create.

  6. Select Newly created incidents or updates to incidents as the rule type and click Continue.

  7. Check Specific Incidents.

  8. Check Severity and from the drop-down option selector, check Critical and Warning. Click Next.

  9. On the Add Actions page, click Add. The Conditional Actions page displays.

  10. In the Conditions for actions section, choose Only execute the actions if specified conditions match.

  11. From the Incident matches the following criteria list, choose Severity and then Critical from the drop-down option selector.

  12. In the Notifications section, enter the DBA in the Page field. Click Continue. The Add Actions page displays.

  13. Click Add to create a new action for the Warning severity.

  14. In the Conditions for actions section, choose Only execute the actions if specified conditions match.

  15. From the Incident matches the following criteria list, choose Severity and then Warning from the drop-down option selector.

  16. In the Notifications section, enter the DBA in the E-mail to field. Click Continue. The Add Actions page displays with the two conditional actions. Click Next. Specify a rule name and description. Click Next. On the Review page, ensure your rules have been defined correctly and click Continue. The Edit Rule Set page displays.

  17. Click Save to save your newly defined rules.

Creating a Rule to Create a Ticket for Incidents

According to the operations policy of an organization, all critical incidents from a production database should be tracked by way of Remedy tickets. A rule is created to invoke the Remedy ticket connector to generate a ticket when a critical incident occurs for the database. When such an incident occurs, the rule generates the ticket, the incident is associated with the ticket, and the operation is logged in the incident's updates for future reference. While viewing the details of the incident, the DBA can view the ticket ID and, using the attached URL link, access Remedy to get the details about the ticket.

Before you perform this task, ensure the following prerequisites are met:

  • Monitoring support has been set up.

  • Remedy ticketing connector has been configured.

Perform the following steps:

  1. From the Setup menu, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, select the appropriate rule set and click Edit.... (Rules are created in the context of a rule set. If there is no applicable rule set, create a new rule set.)

  3. Select the appropriate rule that covers the incident conditions for which tickets should be generated and click Edit....

    1. Specify that a ticket should be generated for incidents covered by the rule.

    2. Specify the ticket template to be used.

  4. Repeat step 3 until all appropriate rules have been edited.

  5. Click Save.

Creating a Rule to Notify Different Administrators Based on the Event Type

As per operations policy for production databases, the incidents that relate to application issues should go to the application DBAs and the incidents that relate to system parameters should go to the system DBAs. Accordingly, the respective incidents will be assigned to the appropriate DBAs and they should be notified by way of e-mail.

Before you set up rules, ensure the following prerequisites are met:

  • DBA has set up appropriate thresholds for the metric so that a critical metric alert is generated as expected.

  • A rule has been set up to create an incident for all such events.

  • Respective notification setup is complete, for example, global SMTP gateway, e-mail address, and schedule for individual DBAs.

Perform the following steps:

  1. Navigate to the Incident Rules page.

    From the Setup menu, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, highlight a rule set and click Edit.... (Rules are created in the context of a rule set. If there is no existing rule set, create a rule set.)

  3. Search the list of enterprise rules matching the events from the production database.

  4. Select the rule which creates the incidents for the metric alert events for the database. Click Edit.

  5. Enter the specific metrics that identify application issues as the condition to match the incidents.

  6. Enter the specific metrics that identify issues with system parameters as the condition to match the incidents.

  7. Type a summary message, for example: Assign the incident to Cindy (Enterprise Manager administrator handling the system parameter issues). For the action, select to e-mail her.

  8. Review the rules summary and make corrections as needed. Click Save.

Creating Notification Subscription to Existing Enterprise Rules

A DBA is aware that incidents he owns will be escalated if they are not resolved within 48 hours, and wants to be notified when the rule escalates an incident. The DBA can subscribe to the rule that escalates the incident and is then notified whenever the rule escalates it.

Before you set up a notification subscription, ensure the following prerequisites are met:

  • There exists an open incident for a database.

  • There exists a rule that escalates High Priority Incidents for databases that have not been resolved in 48 hours.

Perform the following steps:

  1. From the Setup menu, select Incidents, then select Incident Rules.

  2. On the Incident Rules - All Enterprise Rules page, highlight the rule set containing the incident escalation rule in question and click Edit... (Rules are created in the context of a rule set. If there is no existing rule set, create a rule set.)

  3. In the Rules section of the Edit Rule Set page, highlight the escalation rule and click Edit....

  4. Navigate to the Add Actions page.

  5. Select the action that escalates the incident and click Edit...

  6. In the Notifications section, add the DBA to the E-mail cc list.

  7. Click Continue and then navigate back to the Edit Rule Set page and click Save.

As a result of the edit to the enterprise rule, when an incident stays unresolved for 48 hours, the rule sets the incident to escalation level 1 and an e-mail is sent to the DBA notifying him of the escalation.
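The behavior described above can be sketched as a small model. This is only an illustration of the logic, not an Enterprise Manager API: the Incident and EscalationRule classes, the cc_list attribute, and the e-mail address are all hypothetical names introduced for this example. The 48-hour threshold and escalation level 1 come from the scenario.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List

# Threshold taken from the scenario: escalate after 48 unresolved hours.
ESCALATION_AGE = timedelta(hours=48)


@dataclass
class Incident:
    """Hypothetical stand-in for an incident; not an Enterprise Manager object."""
    summary: str
    opened_at: datetime
    resolved: bool = False
    escalation_level: int = 0


@dataclass
class EscalationRule:
    """Hypothetical rule: escalates overdue incidents and notifies subscribers."""
    cc_list: List[str] = field(default_factory=list)

    def apply(self, incident: Incident, now: datetime) -> List[str]:
        """Escalate the incident if it is overdue; return who gets notified."""
        overdue = now - incident.opened_at >= ESCALATION_AGE
        if incident.resolved or incident.escalation_level >= 1 or not overdue:
            return []
        incident.escalation_level = 1
        return list(self.cc_list)


# Subscribing corresponds to adding the DBA to the E-mail cc list of the action.
rule = EscalationRule()
rule.cc_list.append("dba@example.com")

opened = datetime(2012, 1, 1, 9, 0)
incident = Incident("Tablespace full", opened)

early = rule.apply(incident, opened + timedelta(hours=47))  # not yet overdue
late = rule.apply(incident, opened + timedelta(hours=48))   # escalated now
```

In this sketch the subscription is a property of the rule's action rather than of the incident, which mirrors the procedure: the DBA is added once and is then notified for every incident the rule escalates.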

Manually Ensuring That There Are No Events That Should Be Incidents

Oracle recommends managing via incidents in order to focus on important events or groups of related events. Due to the variety and sheer number of events that can be generated, it is possible that not all important events will be covered by incidents. To help you find these important yet untreated events, Enterprise Manager provides the Events without incidents predefined view.

Perform the following steps:

  1. From the Enterprise menu, select Monitoring, then select Incident Manager.

  2. In the Views region, click Events without incidents.

  3. Select the desired event in the table. The event details display.

  4. In the details area, choose More and then either Create Incident or Add Event to Incident.

Example Scenario

During the initial phase of Enterprise Manager uptake, the DBA manager reviews each day the unacknowledged events on the databases his team is responsible for and filters them to view only those that are not tracked by a ticket or an incident. He browses these events to ensure that none of them requires an incident. If he decides that an event does require an incident to track the issue, he creates an incident directly for that event.

If he triages an event and decides that nobody else has to follow up on it, he marks it as acknowledged. Enterprise Manager filters acknowledged events out of Incident Manager.
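The daily review in this scenario amounts to a simple filter. The following is a minimal sketch under stated assumptions: the Event class, its field names, and the events_needing_review function are hypothetical and do not correspond to any Enterprise Manager interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical event record used only for this illustration."""
    message: str
    target: str
    incident_id: Optional[int] = None   # set when an incident tracks the event
    ticket_id: Optional[str] = None     # set when a ticket tracks the event
    acknowledged: bool = False


def events_needing_review(events: List[Event]) -> List[Event]:
    """Keep unacknowledged events that no incident or ticket is tracking."""
    return [
        e for e in events
        if e.incident_id is None and e.ticket_id is None and not e.acknowledged
    ]


events = [
    Event("Tablespace 95% full", "proddb", incident_id=101),
    Event("Listener down", "proddb"),
    Event("Job failed", "proddb", acknowledged=True),
    Event("CPU threshold exceeded", "proddb", ticket_id="T-7"),
]

to_review = events_needing_review(events)

# The manager then either creates an incident for an event or acknowledges it.
to_review[0].incident_id = 102
remaining = events_needing_review(events)
```

Either outcome of triage (creating an incident or acknowledging the event) removes the event from the review list, so the list shrinks to empty once every event has been handled.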

Problems

For Enterprise Manager 12c, problems focus on the diagnostic incidents and problems generated by the Automatic Diagnostic Repository (ADR), which Oracle software raises automatically when it encounters critical errors in the software. To address the root cause of these diagnostic incidents, Enterprise Manager creates a problem object that represents that root cause. A problem is identified by a problem key, which uniquely identifies the particular error in the software. Each occurrence of this error results in a diagnostic incident, which is then associated with the problem object.
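The relationship between diagnostic incidents and problems can be pictured as grouping by problem key. The sketch below only illustrates that relationship: the dictionary layout, the group_by_problem_key function, and the sample problem key strings are made up for this example and are not read from ADR or Enterprise Manager.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_problem_key(diagnostic_incidents: List[dict]) -> Dict[str, List[int]]:
    """Map each problem key to the IDs of the diagnostic incidents it covers."""
    problems = defaultdict(list)
    for incident in diagnostic_incidents:
        problems[incident["problem_key"]].append(incident["incident_id"])
    return dict(problems)


# Illustrative data only: the keys below are placeholders, not real errors.
diagnostic_incidents = [
    {"incident_id": 1, "problem_key": "ORA 600 [example_a]"},
    {"incident_id": 2, "problem_key": "ORA 7445 [example_b]"},
    {"incident_id": 3, "problem_key": "ORA 600 [example_a]"},
]

problems = group_by_problem_key(diagnostic_incidents)
```

Here three diagnostic incidents collapse into two problems, because incidents 1 and 3 share a problem key and are therefore occurrences of the same underlying error.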

When a problem is raised for Oracle software, Oracle has determined that the recommended recourse is to open a service request (SR), send the diagnostic logs to Oracle Support, and eventually receive a solution from Oracle. As with incidents, Enterprise Manager makes available all tracking, diagnostic, and reporting functions for problem management. Whenever you view all open incidents and problems, whether in Incident Manager or in the context of a target or group home page, you can easily determine what issues are actually affecting your monitored targets.

To manage problems, you should use Support Workbench to open the SR. Access to Support Workbench functionality is available through Incident Manager (Guided Resolution area) in the context of the problem.

The following figure shows the tracking and diagnostic functionality available for problems from Incident Manager.

Figure 3-5 Viewing Problems from Incident Manager

graphic shows incident manager with callouts for problem display.

Moving from Enterprise Manager 10/11g to 12c

Enterprise Manager 12c incident management functionality leverages your existing pre-12c monitoring setup out of the box. Migration is seamless and transparent. For example, if your Enterprise Manager 10g/11g monitoring system sends you e-mails based on specific monitoring conditions, you will continue to receive those e-mails without interruption. To take advantage of 12c features, however, you may need to perform additional migration tasks.

Important:

Alerts that were generated pre-12c will still be available.

Rules

When you migrate to Enterprise Manager 12c, all of your existing notification rules are automatically converted to rules. Technically, they are first converted to event rules, with incidents automatically being created for each event rule.

In general, event rules allow you to define which events should become incidents. However, they also allow you to take advantage of Enterprise Manager's increased monitoring flexibility.

For more information on rule migration, see the following documents:

Privilege Requirements

The Create Enterprise Rule Set resource privilege is now required to create or edit enterprise rule sets and the rules they contain. The exception is migrated notification rules. When pre-12c notification rules are migrated to event rules, the original notification rule owners can still edit their own rules without having been granted the Create Enterprise Rule Set resource privilege. However, they must be granted that privilege if they want to create new rules. Enterprise Manager Super Administrators, by default, can edit and create rule sets.
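The privilege requirements above can be summarized as two checks. This is a simplified sketch, not how Enterprise Manager evaluates privileges: the Admin and Rule classes and both functions are hypothetical, and the sketch assumes that holding the privilege is sufficient to edit any rule, which ignores any additional ownership checks on rule sets.

```python
from dataclasses import dataclass, field
from typing import Set

CREATE_RULE_SET = "Create Enterprise Rule Set"  # privilege name from the text


@dataclass
class Admin:
    """Hypothetical administrator record."""
    name: str
    super_admin: bool = False
    privileges: Set[str] = field(default_factory=set)


@dataclass
class Rule:
    """Hypothetical rule record."""
    owner: str
    migrated_from_notification_rule: bool = False


def can_create_rules(admin: Admin) -> bool:
    """Creating rules needs Super Administrator status or the privilege."""
    return admin.super_admin or CREATE_RULE_SET in admin.privileges


def can_edit_rule(admin: Admin, rule: Rule) -> bool:
    """Editing follows the same check, plus the exception for migrated rules."""
    if can_create_rules(admin):
        return True
    # Exception: owners of migrated notification rules may edit their own rules.
    return rule.migrated_from_notification_rule and rule.owner == admin.name


legacy_owner = Admin("pat")
migrated_rule = Rule(owner="pat", migrated_from_notification_rule=True)
new_rule = Rule(owner="pat")

owner_can_edit_migrated = can_edit_rule(legacy_owner, migrated_rule)
owner_can_edit_new = can_edit_rule(legacy_owner, new_rule)
owner_can_create = can_create_rules(legacy_owner)
```

In this example the owner of a migrated rule can edit it without the privilege, but cannot create new rules until the Create Enterprise Rule Set privilege is granted.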