Sun N1 Grid Engine 6.1 User's Guide

Chapter 1 Introduction to the N1™ Grid Engine 6.1 Software

This chapter provides background information about the system of networked computer hosts that run the N1 Grid Engine 6.1 software (also referred to as the grid engine system). This chapter includes the following topics:

- What Is Grid Computing?
- Managing Workload by Managing Resources and Policies
- How the System Operates
- Grid Engine System Components

You can also find a good overview of grid computing and the N1 Grid Engine product on the YouTube web site: Introduction to Grid Engine.

What Is Grid Computing?

A grid is a collection of computing resources that perform tasks. In its simplest form, a grid appears to users as a large system that provides a single point of access to powerful distributed resources. In its more complex form, which is explained later in this section, a grid can provide many access points to users. In all cases, users treat the grid as a single computational resource. Resource management software such as N1 Grid Engine 6.1 software (grid engine software) accepts jobs submitted by users. The software uses resource management policies to schedule jobs to be run on appropriate systems in the grid. Users can submit millions of jobs at a time without being concerned about where the jobs run.

No two grids are alike. One size does not fit all situations. The three key classes of grids, which scale from single systems to supercomputer-class compute farms that use thousands of processors, are as follows:

- Cluster grids, the simplest form, in which the computing resources are owned and used by a single project or department and users have a single point of access
- Campus grids, in which multiple projects or departments within an organization share computing resources across the campus
- Global grids, which are collections of campus grids that cross organizational boundaries to create very large virtual systems

Figure 1–1 shows the three classes of grids. In the cluster grid, a user's job is handled by only one of the systems within the cluster. However, the user's cluster grid might be part of a more complex campus grid, and the campus grid might in turn be part of a larger global grid. In such cases, the user's job can be handled by any member execution host that is located anywhere in the world.

Figure 1–1 Three Classes of Grids

This figure shows examples of cluster, campus, and global grids.

N1 Grid Engine 6.1 software provides the power and flexibility required for campus grids. The product is useful for existing cluster grids because it facilitates a smooth transition to creating a campus grid. The grid engine system effects this transition by consolidating all existing cluster grids on the campus. In addition, the grid engine system is a good start for an enterprise campus that is moving to the grid computing model for the first time.

The grid engine software orchestrates the delivery of computational power that is based on enterprise resource policies set by the organization's technical and management staff. The grid engine system uses these policies to examine the available computational resources within the campus grid. The system gathers these resources and then allocates and delivers resources automatically, optimizing usage across the campus grid.

To enable cooperation within the campus grid, project owners who use the grid must do the following:

The grid engine software can mediate among the entitlements of many departments and projects that are competing for computational resources.

Managing Workload by Managing Resources and Policies

The grid engine system is an advanced resource management tool for heterogeneous distributed computing environments. Workload management means that the use of shared resources is controlled to best achieve an enterprise's goals, such as productivity, timeliness, and level of service. Workload management is accomplished through managing resources and administering policies. Sites configure the system to maximize usage and throughput, while the system supports varying levels of timeliness and importance. Job deadlines are instances of timeliness. Job priority and user share are instances of importance.

The grid engine software provides advanced resource management and policy administration for UNIX environments that are composed of multiple shared resources. The grid engine system is superior to standard load management tools with respect to the following major capabilities:

The grid engine software provides users with the means to submit computationally demanding tasks to the grid for transparent distribution of the associated workload. Users can submit batch jobs, interactive jobs, and parallel jobs to the grid. For the administrator, the software provides comprehensive tools for monitoring and controlling jobs.

The product also supports checkpointing programs. Checkpointing jobs can migrate from workstation to workstation, without user intervention, as load conditions demand.
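For example, assuming the administrator has defined a checkpointing environment (hypothetically named my_ckpt here), a user could submit a job as a checkpointing job as follows:

% qsub -ckpt my_ckpt my_job.sh

The -ckpt option names the checkpointing environment under which the job can be checkpointed and migrated.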

How the System Operates

The grid engine system does all of the following:

- Accepts jobs from the outside world. Jobs are users' requests for computer resources.
- Holds jobs in a holding area until they can be run.
- Sends jobs from the holding area to a suitable execution device.
- Manages running jobs.
- Logs a record of each job's execution when the job is finished.

Matching Resources to Requests

As an analogy, imagine a large “money-center” bank in one of the world's capital cities. In the bank's lobby are dozens of customers waiting to be served. Each customer has different requirements. One customer wants to withdraw a small amount of money from his account. Arriving just after him is another customer, who has an appointment with one of the bank's investment specialists. She wants advice before she undertakes a complicated venture. Another customer in front of the first two customers wants to apply for a large loan, as do the eight customers in front of her.

Different customers with different needs require different types of service and different levels of service from the bank. Perhaps the bank on this particular day has many employees who can handle the one customer's simple withdrawal of money from his account. But at the same time the bank has only one or two loan officers available to help the many loan applicants. On another day, the situation might be reversed.

The effect is that customers must wait for service unnecessarily. Many of the customers could receive immediate service if only their needs were immediately recognized and then matched to available resources.

If the grid engine system were the bank manager, the service would be organized differently.

Jobs and Queues

In a grid engine system, jobs correspond to bank customers. Jobs wait in a computer holding area instead of a lobby. Queues, which provide services for jobs, correspond to bank employees. As in the case of bank customers, the requirements of each job, such as available memory, execution speed, available software licenses, and similar needs, can be very different. Only certain queues might be able to provide the corresponding service.

To continue the analogy, the grid engine software arbitrates available resources and job requirements in the following way:

Usage Policies

The administrator of a cluster can define high-level usage policies that are customized according to whatever is appropriate for the site. Four usage policies are available:

- Urgency. Jobs are prioritized according to an urgency value that is derived from the job's resource requirements, its deadline, and the time it has been waiting.
- Functional. Jobs receive special treatment because of their affiliation with a certain user, user group, project, or department.
- Share-based. The level of service depends on an assigned share entitlement along with past and current resource usage.
- Override. The cluster administrator can manually grant additional entitlements to individual jobs, users, projects, or other entities.

Policy management automatically controls the use of shared resources in the cluster to best achieve the goals of the administration. High-priority jobs are dispatched preferentially. Such jobs receive higher CPU entitlements if the jobs compete for resources with other jobs. The grid engine software monitors the progress of all jobs and adjusts their relative priorities accordingly, in keeping with the goals defined in the policies.

Using Tickets to Administer Policies

The functional, share-based, and override policies are defined through a grid engine system concept that is called tickets. You might compare tickets to shares of a public company's stock. The more stock shares that you own, the more important you are to the company. If shareholder A owns twice as many shares as shareholder B, A also has twice the votes of B. Therefore shareholder A is twice as important to the company. Similarly, the more tickets that a job has, the more important the job is. If job A has twice the tickets of job B, job A is entitled to twice the resource usage of job B.

Jobs can retrieve tickets from the functional, share-based, and override policies. The total number of tickets, as well as the number retrieved from each ticket policy, often changes over time.

The administrator controls the number of tickets that are allocated to each ticket policy in total. Just as ticket allocation does for jobs, this allocation determines the relative importance of the ticket policies among each other. Through the ticket pool that is assigned to particular ticket policies, the administration can run a grid engine system in different ways. For example, the system can run in a share-based mode only. Or the system can run in a combination of modes, for example, 90% share-based and 10% functional.
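For example (the numbers here are purely illustrative), an administrator might create a total pool of 1,000 tickets and assign 900 of them to the share-based policy and 100 to the functional policy, which yields the 90% share-based, 10% functional combination described above. Within that pool, a job that holds 200 tickets is entitled to twice the resource usage of a job that holds 100 tickets.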

Using the Urgency Policy to Assign Job Priority

The urgency policy can be used in combination with two other job priority specifications:

- The ticket-based priorities that result from the functional, share-based, and override policies
- The POSIX priority that a user can assign to a job when submitting it

A job can be assigned an urgency value, which is derived from three sources:

- The resource requirements that the job specifies
- The job's deadline, if one is set; the closer the deadline, the more urgent the job becomes
- The length of time the job has been waiting to be dispatched

The administrator can separately weight the importance of each of these sources in order to arrive at a job's overall urgency value. For more information, see Chapter 5, Managing Policies and the Scheduler, in Sun N1 Grid Engine 6.1 Administration Guide.
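As a rough sketch (the weight names below are illustrative rather than actual configuration parameters), the overall urgency value can be thought of as a weighted sum of the three contributions:

urgency = (w_resource x resource-requirement contribution)
        + (w_deadline x deadline contribution)
        + (w_waiting  x waiting-time contribution)

The administrator tunes these weights in the scheduler configuration to emphasize whichever sources matter most at the site.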

Figure 1–2 shows the correlation among policies.

Figure 1–2 Correlation Among Policies in a Grid Engine System

This figure shows the correlation among policies in a grid engine system.

Grid Engine System Components

The following sections explain the functions of the most important grid engine system components.

Hosts

Four types of hosts are fundamental to the grid engine system:

- Master host
- Execution hosts
- Administration hosts
- Submit hosts

Master Host

The master host is central to the overall cluster activity. The master host runs the master daemon sge_qmaster and the scheduler daemon sge_schedd. Both daemons control all grid engine system components, such as queues and jobs. The daemons maintain tables about the status of the components, user access permissions, and the like.

By default, the master host is also an administration host and a submit host.

Execution Hosts

Execution hosts are systems that have permission to execute jobs. Therefore execution hosts have queue instances attached to them. Execution hosts run the execution daemon sge_execd.

Administration Hosts

Administration hosts are hosts that have permission to carry out any kind of administrative activity for the grid engine system.

Submit Hosts

Submit hosts enable users to submit and control batch jobs only. In particular, a user who is logged in to a submit host can submit jobs with the qsub command, can monitor job status with the qstat command, and can use the grid engine system's OSF/Motif graphical user interface, QMON, which is described in QMON, the Grid Engine System's Graphical User Interface.
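For example, from a submit host a user might submit a simple batch script and then check on it (the script name and job number here are placeholders):

% qsub my_job.sh
Your job 312 ("my_job.sh") has been submitted
% qstat -j 312

The qstat -j form reports detailed status for a single job, while qstat with no arguments gives a summary listing of pending and running jobs.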


Note –

A system can act as more than one type of host.


Daemons

Three daemons provide the functionality of the grid engine system.

sge_qmaster – The Master Daemon

The center of the cluster's management and scheduling activities, sge_qmaster maintains tables about hosts, queues, jobs, system load, and user permissions. sge_qmaster receives scheduling decisions from sge_schedd and requests actions from sge_execd on the appropriate execution hosts.

sge_schedd – The Scheduler Daemon

The scheduling daemon maintains an up-to-date view of the cluster's status with the help of sge_qmaster. The scheduling daemon makes the following scheduling decisions:

- Which jobs are dispatched to which queues
- How to reorder and reprioritize jobs to maintain share, priority, or deadline goals

The daemon then forwards these decisions to sge_qmaster, which initiates the required actions.

sge_execd – The Execution Daemon

The execution daemon is responsible for the queue instances on its host and for the running of jobs in these queue instances. Periodically, the execution daemon forwards information such as job status or load on its host to sge_qmaster.

Queues

A queue is a container for a class of jobs that are allowed to run on one or more hosts concurrently. A queue determines certain job attributes, for example, whether the job can be migrated. Throughout its lifetime, a running job is associated with its queue. Association with a queue affects some of the things that can happen to a job. For example, if a queue is suspended, all jobs associated with that queue are also suspended.

Jobs need not be submitted directly to a queue. You need to specify only the requirement profile of the job. A profile might include requirements such as memory, operating system, available software, and so forth. The grid engine software automatically dispatches the job to a suitable queue and a suitable host with a light execution load. If you submit a job to a specified queue, the job is bound to this queue. As a result, the grid engine system daemons are unable to select a better-suited device or a device that has a lighter load.
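For example (the resource names and queue name below are illustrative; actual names depend on the site's configuration), a job can simply request a profile and let the system choose where to run, or it can be bound to a specific queue:

% qsub -l arch=sol-sparc64,mem_free=2G my_job.sh   # request a profile; let the system choose
% qsub -q big_mem.q my_job.sh                      # bind the job to one specific queue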

A queue can reside on a single host, or a queue can extend across multiple hosts. For this reason, grid engine system queues are also referred to as cluster queues. Cluster queues enable users and administrators to work with a cluster of execution hosts by means of a single queue configuration. Each host that is attached to a cluster queue receives its own queue instance from the cluster queue.
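For example, qstat -f lists every queue instance that the cluster queues spawn on their hosts, one entry per queue instance in queue@host notation (the output below is abridged, and the queue and host names are illustrative):

% qstat -f
all.q@host1    BIP   0/0/4   0.12   sol-sparc64
all.q@host2    BIP   0/1/4   0.55   lx24-amd64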

Client Commands

The command-line user interface is a set of ancillary programs (commands) that enable you to do the following tasks:

- Submit batch jobs, interactive jobs, and parallel jobs to the grid
- Monitor and control jobs
- Examine the status of hosts, queues, and jobs
- Configure and administer the grid engine system

The grid engine system provides a set of ancillary programs for these tasks, including qsub, qstat, qhost, qdel, qconf, qacct, and qmon.
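For example (the job number is illustrative), a user or administrator might check the execution hosts, remove a job, and review accounting data for a job that has finished:

% qhost          # show status and load of all execution hosts
% qdel 312       # remove job 312 from the system
% qacct -j 312   # report resource usage recorded for finished job 312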

QMON, the Grid Engine System's Graphical User Interface

You can use QMON, the graphical user interface (GUI) tool, to accomplish most grid engine system tasks. Figure 1–3 shows the QMON Main Control window, which is often the starting point for both user and administrator functions. Each icon on the Main Control window is a GUI button that you click to start a variety of tasks. To see a button's name, which also describes its function, rest the pointer over the button.
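QMON is an X11 application, so it is typically started from a shell on a submit host that has access to an X display:

% qmon &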

Figure 1–3 QMON Main Control Window, Defined

This figure shows the QMON Main Control window with callouts identifying the buttons.