Sun N1 System Manager 1.3 Grid Engine Provisioning and Monitoring Guide

Chapter 2 Provisioning N1 Grid Engine Onto Managed Servers

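This chapter describes how to use the N1SM CLI commands to provision (install) and manage N1 Grid Engine (N1GE). Using the N1SM CLI, you can create and manage N1GE versions and application profiles, manage N1GE settings, and install N1GE onto managed servers or unload it from them.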

Most of the functionality provided by the Sun Control Station Grid Engine Management Module (GEMM) is replicated by N1SM CLI commands.

Using N1SM to Install N1 Grid Engine

The general task flow to install N1GE onto a managed server is the following:

  1. Copy the N1GE software from your chosen source (a download or some other media) onto the N1SM management server, as shown in the To Download the N1 Grid Engine Software Onto a Management Server section.

  2. Create an N1GE version using these files. See the To Create an N1GE Version section.

  3. Create an N1GE application profile associated with the version as shown in the To Create an N1GE Application Profile section.

  4. Load the application profile onto the managed servers while defining each server's N1GE role as shown in the Installing N1 Grid Engine onto Servers section.
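The task flow above can be sketched as a short Python helper that emits the N1SM CLI commands for steps 2 through 4 in order. This is only an illustrative sketch, not part of N1SM: the function name install_flow and every argument value are hypothetical, and the sketch assumes that the master host is loaded before the compute hosts. Step 1 (copying the tar files) happens outside the CLI and is not shown.

```python
def install_flow(version, files, profile, master, group):
    """Return the N1SM CLI commands for steps 2-4 of the task flow, in order.

    All names are caller-supplied placeholders. The ordering (master host
    first, then a group of compute hosts) is an assumption of this sketch.
    """
    file_list = ",".join(files)
    return [
        # Step 2: create the N1GE version from the copied tar.gz files.
        f"create application {version} file {file_list} type GridEngine",
        # Step 3: create an application profile with default attributes.
        f"create applicationprofile {profile} application {version} type GridEngine",
        # Step 4a: the master host must be loaded with load server.
        f"load server {master} applicationprofile {profile} type GridEngine hosttype master",
        # Step 4b: compute hosts can be loaded as a server group.
        f"load group {group} applicationprofile {profile} type GridEngine hosttype compute",
    ]
```

Each returned string can be typed at the N1-ok> prompt; the helper only assembles text and does not contact the management server.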

Creating and Managing N1 Grid Engine Versions

This section describes the commands that you use to download the N1GE software onto a management server and to create, view, and delete N1GE versions.

What is an N1GE Version?

The term N1GE version specifically means the combination of an N1GE OS-specific tar.gz file and the n1ge-6_0u4-common.tar.gz file. For example, you would want to have separate N1GE versions for Solaris, Linux, and Microsoft Windows servers. The following table lists some of the OS-specific N1GE versions available from the Sun Download Center (SDLC). The versions in this table are the ones supported by N1SM. N1GE versions for other operating systems are also available and might function with N1SM, but they are not officially supported.

N1GE Platform-Specific File                              Platform
-------------------------------------------------------  --------
Solaris_sparc/tar/n1ge-6_0u4-bin-solaris.tar.gz          Solaris (SPARC platform) 32-bit binaries for Solaris 7, Solaris 8, and Solaris 9 Operating Systems. Note that N1SM does not support Solaris 7 and Solaris 8.
Solaris_sparc/tar/n1ge-6_0u4-bin-solaris-sparcv9.tar.gz  Solaris (SPARC platform) 64-bit binaries for Solaris 7, Solaris 8, and Solaris 9 Operating Systems. Note that N1SM does not support Solaris 7 and Solaris 8.
Solaris_x86/tar/n1ge-6_0u4-bin-solaris-i586.tar.gz       Solaris (x86 platform) binaries for Solaris 8 and Solaris 9 Operating Systems
Solaris_x64/tar/n1ge-6_0u4-bin-solaris-x64.tar.gz        Solaris (x64 platform) 64-bit binaries for Solaris 10
Windows/tar/n1ge-6_0u4-bin-win32-x86.tar.gz              Microsoft Windows (x86 platform) 32-bit binaries for Windows 2000, XP, and Windows Server 2003. Note that N1SM does not support Windows 2000.
Linux24_i586/tar/n1ge-6_0u4-bin-linux24-i586.tar.gz      Linux (x86 platform) binaries for the 2.4 kernel
Linux24_amd64/tar/n1ge-6_0u4-bin-linux24-amd64.tar.gz    Linux (AMD platform) binaries for the 2.4 kernel

You must copy an OS-specific N1GE tar file for each OS you plan to support, as well as the n1ge-6_0u4-common.tar.gz file.


Note –

The N1GE module can use only files in the tar.gz format.


Procedure: To Download the N1 Grid Engine Software Onto a Management Server

Before you can create an N1GE version, you must make the N1GE application files accessible to the management server that you will use to provision the version to managed servers. The previous table lists the available tar files.

Step

    Copy the desired tar files from the source location onto your N1SM management server.

Procedure: To Create an N1GE Version

Before You Begin

Once you have the N1GE tar files available on your management server, you can use them to create an N1SM N1GE version.

Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. Create versions of N1 Grid Engine software using the create application command. The syntax for this command is:


    create application application file file[,file...] type GridEngine

    application

    A unique name for the N1GE version. For example, N1GE6_U4.

    file

    A fully qualified path to the N1GE file to be copied. You can specify *.tar.gz installation files for the N1GE application, and each N1GE application requires the n1ge-6_0u4-common.tar.gz file.

    type

    The type of application; in this case, GridEngine.


    Note –

    Unlike the behavior for OS profiles, a default application profile is not automatically created when you copy an N1GE version to the N1 System Manager. You must create this profile yourself using the create applicationprofile command.



Example 2–1 Create an N1 Grid Engine Version

If your grid consists of Solaris 9 SPARC hosts, then you must include these files in the version:


N1-ok>create application N1GE6_U4 file Solaris_sparc/tar/n1ge-6_0u4-bin-solaris.tar.gz,n1ge-6_0u4-common.tar.gz type GridEngine
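The file requirements for a version (tar.gz format only, the common file plus at least one OS-specific file) can be checked before the command is typed. The following Python sketch is an illustration under stated assumptions: the helper name build_create_application is hypothetical, and the check compares only file base names against the common file name used in this chapter.

```python
COMMON_FILE = "n1ge-6_0u4-common.tar.gz"  # common file name used in this chapter


def build_create_application(name, files):
    """Validate the file list and return a create application command string.

    Hypothetical helper: it checks the rules stated in this section and
    assembles text only; it does not run the N1SM CLI.
    """
    base_names = [path.rsplit("/", 1)[-1] for path in files]
    if any(not base.endswith(".tar.gz") for base in base_names):
        raise ValueError("the N1GE module can use only tar.gz files")
    if COMMON_FILE not in base_names:
        raise ValueError(f"the version must include {COMMON_FILE}")
    if len(base_names) < 2:
        raise ValueError("the version needs at least one OS-specific file")
    return f"create application {name} file {','.join(files)} type GridEngine"
```

A call with the Solaris SPARC binary file and the common file yields a command of the same shape as the example above, while a list without the common file is rejected with a ValueError.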

Procedure: To View Available N1GE Versions

You use the show application command to list all the available N1GE versions or to show detailed information about a specific version, such as its file list.

Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. To list all the available N1GE versions use this command:


    show application all type GridEngine

  3. To list detailed information about a specific N1GE version use this command:


    show application application type GridEngine

    all

    List all the available N1GE versions.

    application

    The name of an N1GE version.

    type

    The type of application; in this case, GridEngine.


Example 2–2 Show N1GE Versions


N1-ok>show application all type GridEngine

N1-ok>show application N1GE6_U4 type GridEngine

Procedure: To Delete an N1GE Version

Before You Begin

You cannot delete an N1GE version if it is currently deployed on a server. To undeploy it, use the unload group or unload server commands to remove the application profile first.


unload group group applicationprofile applicationprofile type GridEngine

unload server server applicationprofile applicationprofile type GridEngine

Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. To delete an N1GE version from N1 System Manager, use this command:


    delete application application type GridEngine

    application

    The name of the N1GE version that you specified with the create application command.

    type

    The type of application that the profile belongs to; in this case, GridEngine.


Example 2–3 Delete N1GE Version


N1-ok>delete application N1GE6_U4 type GridEngine
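Because a deployed version cannot be deleted, the unload commands must come before delete application. The following Python sketch lists the commands in that order. It is an illustration only: the helper name teardown_commands and all argument values are hypothetical, and whether N1SM needs further cleanup (for example, of the application profile itself) before the delete succeeds is not covered here.

```python
def teardown_commands(version, profile, groups=(), servers=()):
    """Return unload commands followed by the delete application command.

    Hypothetical helper: it only orders the command strings described in
    this procedure and does not run the N1SM CLI.
    """
    commands = [
        f"unload group {group} applicationprofile {profile} type GridEngine"
        for group in groups
    ]
    if servers:
        # unload server accepts a comma-separated list of server names.
        server_list = ",".join(servers)
        commands.append(
            f"unload server {server_list} applicationprofile {profile} type GridEngine"
        )
    # The version can be deleted only after it is no longer deployed.
    commands.append(f"delete application {version} type GridEngine")
    return commands
```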

What's Next?

After you create an N1GE version, you must create an application profile and associate it with a particular version before you can provision the version to servers. The next section describes how to do this task.

Creating and Managing N1GE Application Profiles

This section describes how to create, view, and delete an N1GE application profile. An application profile describes the deployment and functional attributes for an N1GE version.

Procedure: To Create an N1GE Application Profile

Before You Begin

After you create the N1GE version, you create an application profile and associate it with the version. This profile is similar to a configuration file for an N1GE version (although it is actually a set of database values). The profile specifies attributes such as which TCP port to use for the N1GE execd daemon or the threshold values that trigger a warning when exceeded.

You can have several application profiles associated with a version but only one profile can be active for a grid at any particular time. It is the application profile that you specify when you deploy N1GE onto a managed server.


Tip –

This functionality is similar to that provided by the Settings menu choice of the GEMM application.



Note –

The actual role that a server plays in a grid (master host and so forth) is not an application profile attribute. You define that role when you load the application profile onto the targeted server.


Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. Use the following command to create an application profile. If you are satisfied with the default N1GE attributes, you do not have to specify them explicitly. The syntax for this command is:


    create applicationprofile applicationprofile application application type GridEngine [N1GE-Attribute attributevalue, N1GE-Attribute attributevalue, ...]

    applicationprofile

    A unique name for the application profile that will be used to provision the various N1GE servers.

    application

    The name of the particular N1GE version to associate with this application profile. This value is the name you specified with the create application command.

    type

    The type of application that the profile belongs to; in this case, GridEngine.

    N1GE-Attribute

    The specific N1GE attribute you want to define.

    N1GE attributes: These attributes define how an application version will be deployed and function when the profile they belong to becomes active. You can have several application profiles but only one profile can be active for a grid at any particular time.

    • adminhomedir – The home directory of the N1GE admin user. Default value is /gridware/sge.

    • adminuid – The UID of the N1GE admin user. Default value is 218.

    • adminusername – The user name of the N1GE admin user. Default value is sgeadmin.

    • execdport – The TCP port to use for the N1GE execd daemon. Default value is 837.

    • instversion – The version of N1GE that will be deployed on the compute and submit hosts. There is no default value.

    • lnxnfsmtopts – The options used when mounting the common directory onto a Linux compute or submit host. The value in this field is inserted into the Linux /etc/fstab file on each host as: nfsservername:nfsmountpoint nfsmountpoint nfs lnxnfsmtopts 0 0. Default value is intr,softload. This value cannot contain spaces.

    • loadcritical – Use this parameter to specify the load critical threshold. If this threshold is exceeded, a load critical alert appears in the Monitor. As with the loadwarning parameter, the value is in terms of system load, as reported by the OS, divided by the number of CPUs. Default value is 3.00.

    • loadwarning – Use this parameter to specify the load warning threshold. If this threshold is exceeded, a load warning alert appears in the Monitor. The value is in terms of system load, as reported by the OS, divided by the number of CPUs. Default value is 1.00.

    • masterport – The TCP port to use for the N1GE qmaster daemon. Default value is 836.

    • maxpendtime – Use this parameter to specify the amount of time that a job spends pending after which a Job Pending alert appears in the Monitor. You set the value in hours. Default value is 24.

    • memcritical – Use this parameter to set the memory critical threshold. If the value drops below this threshold, a memory critical alert appears in the Monitor. You set the value in terms of megabytes of free virtual memory. Default value is 10.

    • memwarning – Use this parameter to set the memory warning threshold. If the value drops below this threshold, a memory warning alert appears in the Monitor. You set the parameter value in terms of megabytes of free virtual memory. Default value is 100.

    • nfsmountpoint – The directory that is mounted from the NFS server for the N1GE common directory. When deploying the master host using N1GE, this value is set automatically to sgeroot/sgecell/common. Once you deploy the master host, you cannot edit this value and it remains in effect for all further deployments of compute and submit hosts. You can edit this setting again only if you uninstall the master host. Default value is /gridware/sge/default/common.

    • nfsservername – The name of the NFS server from which all compute and submit hosts will mount the N1GE “common” directory. When you deploy the master host using N1GE, this parameter is set automatically to the master host. Once you deploy the master host, you cannot edit this value and it remains in effect for all further deployments of compute and submit hosts. You can edit this setting again only if you uninstall the master host. There is no default value.

    • proxyhost – Indicates the host on which monitoring commands are executed. If the master host has been previously deployed using N1GE, then the proxy host is set to this host and cannot be changed until the master is uninstalled. The host you choose must be an N1GE admin host; otherwise, installation and uninstallation of other hosts, as well as monitoring, could fail. There is no default value.

    • sgecell – The N1GE cell name used for the deployment. Default value is default.

    • sgeroot – The root directory under which the N1GE files will be installed. The files will be installed on all hosts in this directory. Default value is /gridware/sge.

    • solnfsmtopts – The options used when mounting the “common” directory onto a Solaris compute or submit host. The value in this field is inserted into the Solaris /etc/vfstab file on each host as: nfsservername:nfsmountpoint nfsmountpoint nfs -yes solnfsmtopts. There is no default value. This value cannot contain spaces.
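The load and memory thresholds in the attribute list can be illustrated with a small Python sketch. It is not the Monitor's implementation: the function names are hypothetical, the defaults are the documented default values, and the sketch assumes that "exceeded" means strictly greater than and "drops below" means strictly less than the threshold.

```python
def load_status(load_average, cpu_count, loadwarning=1.00, loadcritical=3.00):
    """Classify system load after scaling it by the number of CPUs."""
    scaled = load_average / cpu_count
    if scaled > loadcritical:
        return "critical"
    if scaled > loadwarning:
        return "warning"
    return "ok"


def memory_status(free_virtual_mb, memwarning=100, memcritical=10):
    """Classify free virtual memory, given in megabytes."""
    if free_virtual_mb < memcritical:
        return "critical"
    if free_virtual_mb < memwarning:
        return "warning"
    return "ok"
```

For example, a load average of 8.0 on a 4-CPU host scales to 2.0, which is above the default loadwarning value of 1.00 but not above the default loadcritical value of 3.00, so the sketch reports a warning.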


Example 2–4 Create an Application Profile


N1-ok>create applicationprofile N1GE6_U4_Profile application N1GE6_U4 type GridEngine

Procedure: To View Available N1GE Application Profiles

Use the show applicationprofile command to list all available application profiles or detailed information about a specific application profile.

Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. To list all the available N1GE application profiles use:


    show applicationprofile all type GridEngine

  3. To list detailed information about a specific N1GE application profile use:


    show applicationprofile applicationprofile type GridEngine

    all

    List all the available N1GE application profiles.

    applicationprofile

    The name of a particular N1GE application profile.

    type

    The type of application that the profile belongs to; in this case, GridEngine.


Example 2–5 Show an Application Profile


N1-ok>show applicationprofile all type GridEngine

N1-ok>show applicationprofile N1GE6_U4_Profile type GridEngine

The following is an example of a typical application profile produced by a show applicationprofile command.


Name:             p1
Application Name:
Type:             GridEngine
Active:           false
adminhomedir:     /gridware/sge
adminuid:         218
adminusername:    sgeadmin
execdport:        837
instversion:
lnxnfsmtopts:     defaults
loadcritical:     3
loadwarning:      1
masterhost:
masterport:       836
masterready:
maxpendtime:      24
memcritical:      10
memwarning:       100
nfsmountpoint:    /gridware/sge/default/common
nfsservername:
proxyhost:
proxyisadmin:
sgecell:          default
sgeroot:          /gridware/sge
solnfsmtopts:
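Output in this "name: value" layout is straightforward to read into a dictionary, and the resulting values can be used to preview the /etc/fstab entry described for the lnxnfsmtopts attribute. The following Python sketch is an illustration only: the function names are hypothetical, the parser assumes one attribute per line with the name before the first colon, and the NFS server name must be supplied because it has no default value.

```python
def parse_profile(text):
    """Parse 'name: value' lines into a dict; empty values become ''."""
    profile = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            profile[key.strip()] = value.strip()
    return profile


def linux_fstab_entry(profile):
    """Render the Linux /etc/fstab line in the format given for lnxnfsmtopts."""
    server = profile["nfsservername"]
    mount = profile["nfsmountpoint"]
    options = profile["lnxnfsmtopts"]
    if not server:
        raise ValueError("nfsservername is not set")
    if any(ch.isspace() for ch in options):
        raise ValueError("lnxnfsmtopts cannot contain spaces")
    return f"{server}:{mount} {mount} nfs {options} 0 0"
```

With the profile shown above and a hypothetical NFS server named master1, the sketch renders master1:/gridware/sge/default/common /gridware/sge/default/common nfs defaults 0 0.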

Procedure: To Delete an N1GE Application Profile

Before You Begin

You cannot delete an N1GE application profile until the master host that was installed with that profile has been uninstalled. To remove N1GE from the master host, use the unload server command.

Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. Use the delete applicationprofile command to delete an N1GE application profile. The command syntax is:


    delete applicationprofile applicationprofile type GridEngine

    applicationprofile

    The name of the N1GE application profile that you want to delete.

    type

    The type of application that the profile belongs to; in this case, GridEngine.


Example 2–6 Delete an N1GE Application Profile


N1-ok>delete applicationprofile N1GE6_U4_Profile type GridEngine

What's Next?

After you have created an application profile for an N1GE version, you can use the profile to provision a grid engine system. See Installing N1 Grid Engine onto Servers.

Managing N1GE Settings

N1GE settings are global values reflecting the attributes of a particular application profile. You can have several application profiles but only one profile can be active for a grid at any particular time. The settings are described in the attributes section of the create applicationprofile command. To see the settings for a particular profile, use this command:


show applicationprofile applicationprofile type GridEngine

where applicationprofile is the name of the profile whose settings you want to see.

The following is an example of a typical application profile produced by a show applicationprofile command.


Name:             p1
Application Name:
Type:             GridEngine
Active:           false
adminhomedir:     /gridware/sge
adminuid:         218
adminusername:    sgeadmin
execdport:        837
instversion:
lnxnfsmtopts:     defaults
loadcritical:     3
loadwarning:      1
masterhost:
masterport:       836
masterready:
maxpendtime:      24
memcritical:      10
memwarning:       100
nfsmountpoint:    /gridware/sge/default/common
nfsservername:
proxyhost:
proxyisadmin:
sgecell:          default
sgeroot:          /gridware/sge
solnfsmtopts:

Changing Application Profile Settings

The active application profile is the one that was used when the master host was installed. You can change some of these global settings when the application profile is active and they are applied to the grid as a whole. However, you cannot change the settings of a particular server. For an inactive profile, you can change any of the settings.

You can only change the following settings for an active profile when the master host is managed by N1SM:

You can only change the following settings for an active profile when the Master host is an external host (proxy host is being used):

Procedure: To Change an Application Profile Setting

Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. If the profile you want to change is currently active, unload the profile (see To Unload N1GE From a Managed Server).

  3. Edit the profile to make the desired changes.

  4. Reload the profile (see To Load N1GE Onto Managed Servers).

Installing N1 Grid Engine onto Servers

You can install N1GE versions onto managed server groups or onto individual servers. The method of installation is to load an application profile onto a server while specifying the server's N1GE role.


Note –

You cannot install the master host with the load group command. To create an N1GE master host, use the load server command.


Procedure: To Load N1GE Onto a Managed Server Group

Before You Begin

To deploy N1GE, you must previously have created an application (specifying a particular N1GE version) and an associated application profile (specifying the installation parameters).

Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. Use the load group command to install an N1GE version onto a group of servers. This is the command syntax:


    load group group applicationprofile applicationprofile type GridEngine hosttype [hosttype]

    group

    The name of a server group. To create a server group, use the N1SM create group command.

    applicationprofile

    The name of the N1GE application profile that you want to load.

    type

    The type of application that the profile belongs to; in this case, GridEngine.

    hosttype

    The type of N1 Grid Engine host to install. Valid values are compute (also known as an execution host) and submit (also known as an access host).


Example 2–7 Loading N1GE onto a Server Group


N1-ok>load group MyComputeServers applicationprofile N1GE6_U4_Profile type GridEngine hosttype compute

Procedure: To Load N1GE Onto Managed Servers

Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. Use the load server command to install N1GE on one or several managed servers. This is the command syntax:


    load server server[,server...] applicationprofile applicationprofile type GridEngine hosttype [hosttype]

    server

    The management name of a server.

    applicationprofile

    The name of the N1GE application profile that you want to load.

    type

    The type of application that the profile belongs to; in this case, GridEngine.

    hosttype

    The type of N1 Grid Engine host to install. Valid values are compute (also known as an execution host), submit (also known as an access host), and master.


Example 2–8 Loading N1GE on a Master Host


N1-ok>load server MyMasterHost applicationprofile N1GE6_U4_Profile type GridEngine hosttype master
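The host type rules for the two load commands (load group accepts compute and submit; load server also accepts master) can be captured in a short validation sketch. This Python helper is hypothetical and only assembles the command text from the syntax shown in this section; it does not run the N1SM CLI, and it assumes that load group takes exactly one group name.

```python
VALID_HOSTTYPES = {
    "group": {"compute", "submit"},             # the master host cannot be group-loaded
    "server": {"compute", "submit", "master"},
}


def build_load(kind, names, profile, hosttype):
    """Return a load group or load server command after checking the host type."""
    if kind not in VALID_HOSTTYPES:
        raise ValueError("kind must be 'group' or 'server'")
    if hosttype not in VALID_HOSTTYPES[kind]:
        raise ValueError(f"hosttype {hosttype!r} is not valid for load {kind}")
    if not names or (kind == "group" and len(names) != 1):
        raise ValueError("give one group name, or one or more server names")
    targets = ",".join(names)
    return (
        f"load {kind} {targets} applicationprofile {profile} "
        f"type GridEngine hosttype {hosttype}"
    )
```

A request such as build_load("group", ["MyComputeServers"], "MyProfile", "master") is rejected, which mirrors the note that the master host must be installed with load server.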

Procedure: To Unload N1GE From a Managed Server Group

Before You Begin

You cannot use the unload group command to uninstall an N1GE master host; you must use the unload server command.

Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. Use the unload group command to uninstall N1GE from a group of servers. This is the command syntax:


    unload group group applicationprofile applicationprofile type GridEngine

    group

    The name of a server group. To create a server group, use the N1SM create group command.

    applicationprofile

    The name of the N1GE application profile that you want to unload.

    type

    The type of application that the profile belongs to; in this case, GridEngine.


Example 2–9 Unloading N1GE From a Managed Server Group


N1-ok>unload group MyComputeServers applicationprofile N1GE6_U4_Profile type GridEngine

Procedure: To Unload N1GE From a Managed Server

Steps
  1. Access the N1SM CLI (see Accessing the N1SM CLI).

  2. Use the unload server command to uninstall N1GE from one or more servers. The command syntax is:


    unload server server[,server...] applicationprofile applicationprofile type GridEngine

    server

    The management name of a server.

    applicationprofile

    The name of the N1GE application profile that you want to unload.

    type

    The type of application that the profile belongs to; in this case, GridEngine.


Example 2–10 Unloading a Profile from a Managed Server


N1-ok>unload server MyMasterHost applicationprofile N1GE6_U4_Profile type GridEngine