3 Job and Event Systems

This chapter describes the Oracle Enterprise Manager Job Scheduling System and Event Management System.

The Job Scheduling System enables the automation of standard and repetitive tasks. With the Job system, you can create and manage jobs, schedule runs of jobs, and view information about jobs. Jobs can be scheduled against a single target (database or other service) or multiple targets in a network. If the node or its Intelligent Agent is down, the job request is queued, and once the node can be contacted, the queued job is submitted to the agent.

The Event Management System monitors your network environment for specific event conditions, such as loss of service or lack of storage. When registering an event, you choose pre-defined tests for Intelligent Agents to run against managed services, and then select the notification parameters that determine how you or other system administrators are notified. For some events, a fixit job can be set to run automatically in response to the event to correct the problem.

This chapter covers the following topics:

Job Scheduling System
Event Management System

Job Scheduling System

The Job Scheduling System allows you to schedule and manage job tasks throughout the network, even remotely. Any job that an administrator can perform from the operating system command line or with SQL can be sent from the Job Scheduling System and performed on any remote system.

With the Job Scheduling System, you can perform asynchronous tasks on multiple databases and other services without having to maintain connections to all those services. In addition, jobs can run simultaneously on different nodes in the system.

The three tiers of Oracle Enterprise Manager - the console and its Job Scheduling pane, the Oracle Management Server, and agents residing on managed nodes - work in unison to schedule and execute the job.

From job scheduling to job completion, the following steps occur:

1. From the console Jobs pane, a job made up of one or more tasks is submitted.
2. The Oracle Management Server stores the information and checks whether the target node is up or down. If the node or its agent is down, the Oracle Management Server queues the job.
3. Once the node can be contacted, the Oracle Management Server sends the job information to the Intelligent Agent residing on the managed node. (Jobs can be sent to multiple nodes concurrently.)
4. The agent executes the job on schedule.
5. The agent returns any related job messages to the Oracle Management Server for display in the appropriate consoles, based on administrator permissions. If the agent cannot contact the Oracle Management Server, it queues the messages.
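
The queue-and-deliver behavior in these steps can be pictured with a small Python model. This is only an illustrative sketch, not Oracle code: the `ManagementServer` class, its `submit` and `node_up` methods, and the node names are hypothetical, and a real Oracle Management Server also persists jobs in its repository.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    tasks: list[str]


@dataclass
class ManagementServer:
    """Toy stand-in for the dispatch role of the Oracle Management Server."""
    reachable: dict[str, bool]                                  # node -> is its agent reachable?
    pending: dict[str, deque] = field(default_factory=dict)     # jobs queued per node
    delivered: list[tuple[str, str]] = field(default_factory=list)

    def submit(self, job: Job, nodes: list[str]) -> None:
        # The job is submitted once; delivery is attempted per destination.
        for node in nodes:
            if self.reachable.get(node, False):
                self.delivered.append((node, job.name))
            else:
                self.pending.setdefault(node, deque()).append(job)

    def node_up(self, node: str) -> None:
        # When the node can be contacted again, queued jobs go out in order.
        self.reachable[node] = True
        queue = self.pending.pop(node, deque())
        while queue:
            self.delivered.append((node, queue.popleft().name))


server = ManagementServer(reachable={"node_a": True, "node_b": False})
server.submit(Job("nightly_backup", ["shutdown", "copy", "startup"]), ["node_a", "node_b"])
server.node_up("node_b")
```

After `node_up("node_b")` runs, the job that was held for the unreachable node has been handed over without being resubmitted from the console.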

This section discusses the benefits of the Job Scheduling System.

Pre-defined System Tasks
Job Scheduling
Lights-out Management
Cross-Platform Job Scripts
Job Progress
Job Notification and Filtering
Communication with the Intelligent Agent
Complex Jobs
Scalability
Security and Jobs

Pre-defined System Tasks

When scheduling a job, you construct it with one or more tasks. The Job Scheduling System includes a variety of pre-defined tasks from which to select, such as starting up or shutting down services, and running SQL scripts or operating system programs.

Figure 3-1 Selecting Tasks When Creating a Job

Job Scheduling

The Job Scheduling System is easy to use because the task of scheduling and managing jobs is centralized in the Oracle Management Server. A job needs only to be sent once, regardless of the number of destinations or the number of times it will run.

After a job is submitted, the Oracle Management Server sends the job information to the appropriate Intelligent Agents on the selected destinations. The agents are responsible for running the job on schedule and returning job status messages to the Oracle Management Server, which then alerts the appropriate console(s).

When a job is submitted to one or more destinations, it is possible that any one of those services may be down. If a service or its agent is down, the Oracle Management Server queues job requests that could not be delivered to the service. Once the service can be contacted, the Oracle Management Server submits the queued job to the agent.

If a job has been scheduled with an agent and the connection between the agent and the Oracle Management Server goes down, the agent still executes the job on schedule. When the job completes, the agent notifies the Oracle Management Server if it can be contacted, and the server then displays the status of the job on the console. If the Oracle Management Server cannot be contacted, the agent queues the status message until the server is available.
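
The agent-side queueing of status messages can be sketched in Python as follows. The `AgentOutbox` class and the `send` callable are hypothetical names introduced for illustration; the sketch assumes only that a delivery attempt either succeeds or fails, and says nothing about how the Intelligent Agent actually stores or transmits messages.

```python
from collections import deque


class AgentOutbox:
    """Hypothetical agent-side buffer for job status messages."""

    def __init__(self, send):
        self._send = send        # callable: returns True if the server accepted the message
        self._queue = deque()

    def report(self, message: str) -> None:
        # Always enqueue first so ordering is preserved, then try to deliver.
        self._queue.append(message)
        self.flush()

    def flush(self) -> int:
        # Deliver from the front of the queue; stop at the first failure.
        sent = 0
        while self._queue and self._send(self._queue[0]):
            self._queue.popleft()
            sent += 1
        return sent

    def backlog(self) -> int:
        return len(self._queue)
```

Because a message is removed from the queue only after a successful send, messages produced while the server is unreachable are delivered later in their original order.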

To schedule a job, you do not have to connect directly to the node on which the job will be run. You only need to submit the job from the console and specify the destinations on which it should run. The destinations can include nodes, databases (and other services), listeners, and groups of such destinations.

Lights-out Management

The Job Scheduling System allows you to automate repetitive and periodic tasks and problem correction. If a job needs to be run periodically, the agents reschedule the job without the need for additional intervention. Messages about a job's status are reported back to the console.
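
As an illustration of periodic rescheduling, the following Python sketch computes the next run time of a repeating job. It assumes a fixed, positive interval and a policy of skipping runs that were missed; the function name `next_run` and that policy are assumptions made for the example, not a description of how the Intelligent Agent schedules jobs.

```python
from datetime import datetime, timedelta


def next_run(first_run: datetime, interval: timedelta, now: datetime) -> datetime:
    """Return the first scheduled run time strictly after `now`."""
    if now < first_run:
        return first_run
    # Number of whole intervals already elapsed, plus one for the upcoming run.
    periods = (now - first_run) // interval + 1
    return first_run + periods * interval
```

For example, a job first run at 02:00 on 1 January with a one-day interval, evaluated at 01:00 on 3 January, is next due at 02:00 on 3 January.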

The Job Scheduling System can be used with the Event Management System to automate problem correction. When you register an event, you have the option of specifying a fixit job, which will automatically be run in response to an event to correct the problem.

Cross-Platform Job Scripts

Jobs are implemented as Tool Command Language (Tcl) scripts. Tcl is a platform-independent scripting language used to write both job and event scripts. For example, a job can be run against a UNIX and an NT machine at the same time, without changing a single byte of information in the job definition.

Job Progress

You can monitor the progress of a job by double-clicking the job in the Active Jobs page of the Jobs pane. The Job Properties dialog box appears, providing information about the job's activities and progress.

After a job is run, a list of tasks comprising the job and the time that each task completed or failed appears in the Progress tab of the Job Properties dialog box.

Job Notification and Filtering

Administrators can be notified in various ways of the status of jobs, such as by electronic mail or page, depending on the administrator's preferences. With the job scheduling system, you can set up notification procedures and choose which administrators to have notified of job completion or failure. You can also filter e-mail and pages sent to administrators according to a job's status.
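
Filtering notifications by job status might be modeled as in the Python sketch below. The `Administrator` class, its fields, and the status strings `"completed"` and `"failed"` are hypothetical and chosen for illustration only; in the product, these preferences are set through the console.

```python
from dataclasses import dataclass, field


@dataclass
class Administrator:
    name: str
    channels: set[str]                      # for example {"email", "page"}
    # Job statuses this administrator wants to hear about (the filter).
    statuses: set[str] = field(default_factory=lambda: {"completed", "failed"})


def recipients(admins: list[Administrator], job_status: str) -> list[tuple[str, str]]:
    """Return sorted (administrator, channel) pairs to notify for a job status."""
    return sorted(
        (admin.name, channel)
        for admin in admins
        if job_status in admin.statuses
        for channel in admin.channels
    )
```

An administrator whose filter contains only `"failed"` is left out when a job completes successfully but is included when it fails.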

Communication with the Intelligent Agent

Although a job is submitted from the console, the job scripts themselves reside with the Intelligent Agents on the managed nodes. Because the manner in which a job is implemented may depend on the platform, each agent keeps its own set of job scripts.

Complex Jobs

A complex job is a job composed of more than one task. Tasks in a job can be set up in any order, and can be configured to depend on the success or failure of other tasks in the job. For instance, a task in a job can be configured to halt if the previous task in the job fails.
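
The dependency rule can be illustrated with a short Python sketch. The `Task` fields `depends_on` and `run_if`, and the result strings, are hypothetical names for this example; the sketch treats a task whose condition is not met as skipped, which is one possible reading of "halt".

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Task:
    name: str
    action: Callable[[], bool]          # returns True on success
    depends_on: Optional[str] = None    # name of an earlier task in the job
    run_if: str = "success"             # run on "success" or "failure" of that task


def run_job(tasks: list[Task]) -> dict[str, str]:
    """Run tasks in order, honoring success/failure dependencies."""
    results: dict[str, str] = {}
    for task in tasks:
        if task.depends_on is not None:
            wanted = "completed" if task.run_if == "success" else "failed"
            if results.get(task.depends_on) != wanted:
                results[task.name] = "skipped"
                continue
        results[task.name] = "completed" if task.action() else "failed"
    return results
```

With tasks `shutdown`, `backup` (depends on `shutdown` succeeding), and `startup` (depends on `backup` succeeding), a failed backup leaves `startup` skipped, while a task configured with `run_if="failure"` on `backup` would run instead.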

Scalability

The Job Scheduling System allows you to run jobs efficiently on multiple remote nodes. When you submit a job to run on a remote node, all the information needed to run the job is transferred to the agent servicing the node.

When the job is run, it is run by the agent on that node, minimizing network traffic between the remote node, the Oracle Management Server, and the console. The only communication between the agent and the Oracle Management Server is the initial transmission of the job and any subsequent messages about job status.

Because jobs are run independently by agents, you can submit any number of jobs on multiple nodes without affecting the console. For example, you can submit several jobs and then immediately administer something else without waiting for the agents to schedule the jobs.

Additionally, because there is an Intelligent Agent residing on each managed node, jobs can run on multiple nodes simultaneously. For example, you can submit a job, such as running a report, on multiple databases worldwide. The job is then run independently by each agent servicing each database. In this way, all jobs are performed by their respective agents at the same time.
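
To picture agents working independently and at the same time, the Python sketch below uses a thread pool, with one worker standing in for each agent. This is an analogy only: real agents are separate processes on separate nodes, and the function names and database names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_agent(database: str, job_name: str) -> str:
    # Stand-in for an agent executing the job locally against its database.
    return f"{job_name} finished on {database}"


def submit_to_all(databases: list[str], job_name: str) -> list[str]:
    """Run the same job for every database concurrently; results keep input order."""
    if not databases:
        return []
    with ThreadPoolExecutor(max_workers=len(databases)) as pool:
        return list(pool.map(lambda db: run_on_agent(db, job_name), databases))
```

The caller submits the job once and collects one status per destination, which mirrors how a single submission from the console fans out to many agents.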

Security and Jobs

When jobs are run on a managed service, your preferred credentials for that service (stored in the repository) are usually used for accessing that service; therefore, you can perform any task from the console that you could perform if you were logged directly into the service using those credentials.

Event Management System

The Event Management System can be used to automatically monitor managed targets for potential problems, such as loss of service or lack of storage. The administrator defines what to monitor by creating an "event", which describes a potential problem condition that the Intelligent Agent watches for on the monitored services. An event is made up of one or more tests.

When a test within an event returns true, the Event Management System notifies you or another administrator that you specify. In certain cases, Oracle Enterprise Manager can also run a pre-defined fixit job in response to an event to correct the problem.

The Event Management System includes the following processes:

Creation and registration of events by the selection of one or more tests
Notification of specified administrators
Correction of problem occurrences

The registering and monitoring of an event involves the following steps:

1. From the Events pane in the console, you register an event made up of one or more tests that you select from a list of pre-defined tests. In registering the event, you select the managed target(s) in the network that you want to monitor. You then select one or more tests to make up the event, specifying threshold parameters for each test.
2. When you have completed these event specifications, you submit the event.
3. The event is passed to the Oracle Management Server, which stores the information and checks whether the node is up or down. If the node or its agent is down, the Oracle Management Server queues the event.
4. If the node and its agent are up, the Oracle Management Server sends the event information to the Intelligent Agent on the managed node containing the destination service. (Events can be sent to multiple nodes concurrently.)
5. When an Intelligent Agent receives an event, it runs the event's tests against the target service or services at user-defined polling intervals, which continue until the event is cancelled (de-registered).
6. When one or more tests on a service return true, the agent alerts the Oracle Management Server. If the Oracle Management Server is not reachable, the agent queues the messages.
7. Once the Oracle Management Server is notified, the Event Management System notifies administrators who have the Notify permission for the event and are scheduled for notification. Administrators are notified by console alert if they have at least view permissions, and can also be notified by page or e-mail alert if specified.
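
One polling cycle from these steps can be sketched in Python. The classes `EventTest` and `Event`, the `measure` probe, and the `alert` callback are hypothetical names for illustration; the sketch assumes that a test returns true when its measured value exceeds its threshold.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EventTest:
    name: str
    measure: Callable[[], float]    # hypothetical probe run by the agent
    threshold: float

    def triggered(self) -> bool:
        return self.measure() > self.threshold


@dataclass
class Event:
    name: str
    target: str
    tests: list[EventTest]

    def poll(self) -> list[str]:
        """Run every test once and return the names of tests that returned true."""
        return [test.name for test in self.tests if test.triggered()]


def poll_and_alert(event: Event, alert: Callable[[str, str, list[str]], None]) -> bool:
    # The agent alerts the server only when at least one test returns true.
    fired = event.poll()
    if fired:
        alert(event.name, event.target, fired)
    return bool(fired)
```

In a real deployment the agent would repeat this cycle at the polling interval until the event is de-registered; the sketch shows a single cycle so that it stays deterministic.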

Information about an event's status is viewable in the Event Viewer window, which is accessible when you double-click a listed event on the Alerts page or History page in the Events pane. In the Event Viewer window you can check the status of an event and share information about it with other administrators by recording and viewing comments in the Event Log.

The Event Management System contains the following features:

Proactive Event Management
Scalability
Event Notification and Filtering
Event Log

Proactive Event Management

When registering an event, you have the option to specify a fixit job, which is run by the agent on the managed node in response to an event. Events and fixit jobs used together automate problem detection and correction. The proactive management of an event ensures that a problem is corrected before it noticeably impacts end-users.
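
The pairing of an event with an optional fixit job can be sketched in Python as follows. The function `respond_to_event`, its parameters, and its return strings are hypothetical and exist only to show the control flow: notify first, then attempt the correction if one was registered.

```python
from typing import Callable, Optional


def respond_to_event(
    triggered: bool,
    notify: Callable[[str], None],
    fixit_job: Optional[Callable[[], bool]] = None,
) -> str:
    """Notify on a triggered event and run the fixit job, if one is registered."""
    if not triggered:
        return "no action"
    notify("event triggered")
    if fixit_job is None:
        return "notified"
    # The fixit job reports whether the correction succeeded.
    return "fixed" if fixit_job() else "fixit failed"
```

For example, an event that detects a stopped listener could register a fixit job that restarts it, so the correction runs on the managed node without waiting for an administrator.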

Scalability

The Event Management System allows one person to monitor multiple databases and systems. For example, it would be difficult for one person to connect to 100 databases individually every day to check on each database's performance. However, using the Event Management System, one person can effectively have the databases monitored 24 hours a day - with minimal performance impact on the console - and can be alerted if a problem is detected. Because the monitoring is performed by Intelligent Agents independently of the console, multiple services can be monitored without slowing down other tasks.

The Event Management System also gives you the option of focusing on select systems and events. Rather than monitoring all services or a large number of services at once, you can choose to focus on select services.

Event Notification and Filtering

When an event occurs, administrators are notified by console alert (if they have at least view permissions) and can also choose to be alerted by e-mail or page. When registering an event, you specify which administrator(s) are to be notified. You can also filter e-mail and pages sent to administrators according to the event's level of severity.

When the value measured by any test in the event exceeds the threshold specified by the test's parameter values, all designated administrators are notified. If a test does not have parameters, the alert occurs when the test returns true.

Every alert has a severity level indicated by color. The colors are displayed by an event severity flag located next to the name of each event in the Alerts page of the Events pane and on the target object(s) in the Groups pane.

The severity levels indicated by color are:

Alert cleared (green)
Warning (yellow)
Critical (red)
Unknown/Node Down (grey)

An event can be set up to notify an administrator at either a Warning threshold or at a Critical threshold. Additionally, if the event condition changes from Warning to Critical, or vice versa, an updated notification is sent.
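
The severity levels and the updated notifications can be illustrated with the Python sketch below. The enum values mirror the colors listed above; the function names, the use of `None` for an unreachable node, and the choice to emit a notification on every change of severity (including a return to cleared) are assumptions made for the example.

```python
from enum import Enum
from typing import Iterable, Iterator, Optional


class Severity(Enum):
    CLEARED = "green"
    WARNING = "yellow"
    CRITICAL = "red"
    UNKNOWN = "grey"


def classify(value: Optional[float], warning: float, critical: float) -> Severity:
    """Map a measured value to a severity; None models an unreachable node."""
    if value is None:
        return Severity.UNKNOWN
    if value > critical:
        return Severity.CRITICAL
    if value > warning:
        return Severity.WARNING
    return Severity.CLEARED


def notifications(
    readings: Iterable[Optional[float]], warning: float, critical: float
) -> Iterator[Severity]:
    """Yield a severity each time it differs from the previous reading's severity."""
    previous = Severity.CLEARED
    for value in readings:
        current = classify(value, warning, critical)
        if current is not previous:
            yield current
        previous = current
```

Repeated readings at the same severity produce no further notifications; only a change, such as Warning to Critical or back, produces an update.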

In the Events pane, you can manually move an event from the list of events in the Alerts page to the History page, but the listed event will move back to the Alerts page if a test exceeds a threshold again.

Event Log

With the Event Log page, located in the Event Viewer window, administrators can share information with other administrators about events and how they are being managed. The Event Log page allows comments to be entered on a selected event by administrators with modify permissions for the event.

The information displayed in the Event Log page includes any comments that have been entered for the event, the names of the administrators who entered the comments, and the time and date each comment was entered. The Event Management System itself also enters data in the Event Log page.

Unsolicited Error Detection

An unsolicited event is an event that is registered by a third-party application in an enterprise managed by Oracle Enterprise Manager. These events are not Oracle Enterprise Manager events, but can still be registered by third-party applications if the application is using the Oracle Enterprise Manager API (Application Program Interface). When an unsolicited event occurs, the Intelligent Agent on the node is notified and sends the message to the Oracle Management Server, which then notifies the Oracle Enterprise Manager client.

Note:

Fixit jobs cannot be specified for unsolicited events.
