Sun N1 Grid Engine 6.1 Installation Guide

Chapter 2 Installing the N1 Grid Engine Software Interactively

This chapter discusses the steps that you must take to manually install the grid engine software.


Note –

The instructions in this chapter assume that you are installing the software on a computer running the Solaris™ Operating System. Differences in functionality on the other operating system architectures that the grid engine software runs on are documented in files whose names start with the string arc_depend_ in the sge-root/doc directory. The remainder of each file name indicates the operating system architecture to which the comments in that file apply, as in the arc_depend_irix.asc file.

Also note that there are several prerequisites that you must satisfy for Windows systems before you can install N1 Grid Engine 6.1. See Appendix A, Microsoft Services For UNIX.


This chapter discusses the following topics:


Note –

This chapter does not cover the upgrade process or the installation of the Accounting and Reporting Console (ARCo). For information about upgrading, see Chapter 5, Upgrading From a Previous Release of N1 Grid Engine Software. For information about installing ARCo, see Chapter 8, Installing the Accounting and Reporting Console.


Interactive Installation Overview


Note –

The instructions in this section are for a new grid engine system only. For instructions on how to install a new system with additional security protection, see Chapter 4, Installing the Increased Security Features. For instructions on how to upgrade an existing installation of an earlier version of grid engine software, see Chapter 5, Upgrading From a Previous Release of N1 Grid Engine Software.


Full installation includes the following tasks:

Performing an Installation

This section includes instructions for performing the following tasks:

The following sections describe how to install all the components of the grid engine system, including the master, execution, administration, and submit hosts. If you need to install the system with enhanced security, see Chapter 4, Installing the Increased Security Features, before you continue with the installation.

Procedure: How to Install the Master Host

The master installation procedure creates the directory hierarchy required by sge_qmaster and sge_schedd, and starts these two grid engine system daemons on the master host. The master host is also registered as a host with administrative and submit permission. The installation procedure creates a default configuration for the system on which it is run: the installation script queries the system for the type of operating system and then chooses appropriate settings based on this information.

If, at any time during the installation, you think something went wrong, you can quit the installation procedure and restart it.

Before You Begin

If you have decided to use an administrative user, as described in User Names, you should create that user now. This procedure assumes that you have already extracted the grid engine software, as described in Loading the Distribution Files on a Workstation.
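
Before starting, it can help to confirm that the directory you intend to use as sge-root really contains the extracted distribution. The following is a minimal sketch, assuming a POSIX shell. The function name check_sge_root is hypothetical and is not part of the grid engine software. It only checks that the two installer scripts named in this chapter, install_qmaster and install_execd, are present and executable.

```shell
# Hypothetical helper (not shipped with the grid engine software):
# succeed only if DIR looks like an extracted distribution, that is,
# it contains both installer scripts used in this chapter.
check_sge_root() {
    dir=$1
    if [ -z "$dir" ]; then
        echo "SGE_ROOT is not set" >&2
        return 1
    fi
    for script in install_qmaster install_execd; do
        if [ ! -x "$dir/$script" ]; then
            echo "missing or not executable: $dir/$script" >&2
            return 1
        fi
    done
    return 0
}

# Example use (commented out because it depends on your environment):
# check_sge_root "$SGE_ROOT" && cd "$SGE_ROOT"
```

A passing check does not prove that the file permissions or the contents of the distribution are correct. The installation script performs its own verification later.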


Note –

Windows hosts cannot act as master hosts.
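
The procedure that follows asks for sge_qmaster and sge_execd entries in the services database. One way to see in advance whether a local services file already has them is sketched below. This is an illustration only: the function name service_port is hypothetical, it reads a plain file and does not consult NIS or NIS+ maps, and it assumes an awk that accepts the -v option (on Solaris this may mean using nawk or /usr/xpg4/bin/awk).

```shell
# Hypothetical helper: print the port registered for SERVICE/tcp in FILE.
# Fails (non-zero status) if no such entry exists. Comment lines are skipped.
service_port() {
    awk -v svc="$1" '
        $1 ~ /^#/ { next }
        $1 == svc && $2 ~ /\/tcp$/ {
            sub(/\/tcp$/, "", $2)   # strip the protocol suffix
            print $2
            found = 1
            exit
        }
        END { exit found ? 0 : 1 }
    ' "$2"
}

# Example use (commented out because it reads the live system file):
# for svc in sge_qmaster sge_execd; do
#     service_port "$svc" /etc/services >/dev/null ||
#         echo "no $svc entry in /etc/services"
# done
```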


  1. Log in to the master host as root.

  2. If the $SGE_ROOT environment variable is not set, set it by typing:


    # SGE_ROOT=sge-root; export SGE_ROOT
    

    To confirm that you have set the $SGE_ROOT environment variable, type:


    # echo $SGE_ROOT
    
  3. Change to the installation directory.

    • If the directory where the installation files reside is visible from the master host, change directories (cd) to the installation directory sge-root, and then proceed to Step 4.

    • If the directory is not visible and cannot be made visible, do the following:

      1. Create a local installation directory, sge-root, on the master host.

      2. Copy the installation files to the local installation directory sge-root across the network, for example, by using ftp or rcp.

      3. Change directories (cd) to the local sge-root directory.

  4. Type the install_qmaster command, adding the -csp flag if you are installing using the Certificate Security Protocol method described in Chapter 4, Installing the Increased Security Features.

    This command starts the master host installation procedure. You are asked several questions, and you might be required to run some administrative actions.


    % ./install_qmaster
    Welcome to the Grid Engine installation
    ---------------------------------------
    
    Grid Engine qmaster host installation
    -------------------------------------
    
    Before you continue with the installation please read these hints:
    
       - Your terminal window should have a size of at least
         80x24 characters
    
       - The INTR character is often bound to the key Ctrl-C.
         The term >Ctrl-C< is used during the installation if you
         have the possibility to abort the installation
    
    The qmaster installation procedure will take approximately 5-10 minutes.
    
    Hit <RETURN> to continue >> 
  5. Choose an administrative account owner.


    Choosing Grid Engine admin user account
    ---------------------------------------
    
    You may install Grid Engine that all files are created with the user id of an
    unprivileged user.
    
    This will make it possible to install and run Grid Engine in directories
    where user >root< has no permissions to create and write files and directories.
    
       - Grid Engine still has to be started by user >root<
    
       - this directory should be owned by the Grid Engine administrator
    
    Do you want to install Grid Engine
    under an user id other than >root< (y/n) [y] >> y
    

    Choosing a Grid Engine admin user name
    --------------------------------------
    
    Please enter a valid user name >> sgeadmin
    
    Installing Grid Engine as admin user >sgeadmin<
    
    Hit <RETURN> to continue >>
  6. Verify the sge-root directory setting.

    In the following example, the value of sge-root is /opt/n1ge6.


    Checking $SGE_ROOT directory
    ----------------------------
    
    The Grid Engine root directory is:
    
       $SGE_ROOT = /opt/n1ge6
    
    If this directory is not correct (e.g. it may contain an automounter
    prefix) enter the correct path to this directory or hit <RETURN>
    to use default [/opt/n1ge6] >> 
    
    Your $SGE_ROOT directory: /opt/n1ge6
    
    Hit <RETURN> to continue >> 
  7. Set up the TCP/IP services for the grid engine software.

    1. You will be notified if the TCP/IP services have not been configured.


      Grid Engine TCP/IP service >sge_qmaster<
      ----------------------------------------
      
      There is no service >sge_qmaster< available in your >/etc/services< file
      or in your NIS/NIS+ database.
      
      You may add this service now to your services database or choose a port number.
      It is recommended to add the service now. If you are using NIS/NIS+ you should
      add the service at your NIS/NIS+ server and not to the local >/etc/services<
      file.
      
      Please add an entry in the form
      
         sge_qmaster <port_number>/tcp
      
      to your services database and make sure to use an unused port number.
      
      Please add the service now or press <RETURN> to go to entering a port number >> 
    2. Start a new terminal session or window to add the information to the /etc/services file or your NIS maps.

    3. Add the correct ports to the /etc/services file or your NIS services map, as described in Network Services.

      The following example shows how you might edit your /etc/services file.


      ...
      sge_qmaster     6444/tcp
      sge_execd       6445/tcp
      

      Note –

      In this example, the entries for both sge_qmaster and sge_execd are added to /etc/services. Subsequent steps in this example assume that both entries have been made.


      Save your changes.

    4. Return to the window where the installation script is running.


      Please add the service now or press <RETURN> to go to entering a port number >> 

      Press the Return key. You will see the following output:


      sge_qmaster 6444
      
      Service >sge_qmaster< is now available.
      
      Hit <RETURN> to continue >> 

      Grid Engine TCP/IP service >sge_execd<
      --------------------------------------
      
      Using the service
      
         sge_execd
      
      for communication with Grid Engine.
      
      Hit <RETURN> to continue >> 
  8. Type the name of your cell.

    The use of grid engine system cells is described in Cells.


    Grid Engine cells
    -----------------
    
    Grid Engine supports multiple cells.
    
    If you are not planning to run multiple Grid Engine clusters or if you don't
    know yet what is a Grid Engine cell it is safe to keep the default cell name
    
       default
    
    If you want to install multiple cells you can enter a cell name now.
    
    The environment variable
    
       $SGE_CELL=<your_cell_name>
    
    will be set for all further Grid Engine commands.
    
    Enter cell name [default] >> 
    • If you have decided to use cells, type the cell name now.

    • If you have decided not to use cells, press the Return key to continue.


      Using cell >default<. 
      Hit <RETURN> to continue >> 

    Press the Return key to continue.

  9. Specify a spool directory.

    For guidelines on disk space requirements for the spool directory, see Disk Space Requirements. For information on where the spool directory is installed, see Spool Directories Under the Root Directory.


    Grid Engine qmaster spool directory
    -----------------------------------
    
    The qmaster spool directory is the place where the qmaster daemon stores
    the configuration and the state of the queuing system.
    
    The admin user >sgeadmin< must have read/write access
    to the qmaster spool directory.
    
    If you will install shadow master hosts or if you want to be able to start
    the qmaster daemon on other hosts (see the corresponding section in the
    Grid Engine Installation and Administration Manual for details) the account
    on the shadow master hosts also needs read/write access to this directory.
    
    The following directory
    
    [/opt/n1ge6/default/spool/qmaster]
    
    will be used as qmaster spool directory by default!
    
    Do you want to select another qmaster spool directory (y/n) [n] >> 
    • If you want to accept the default spool directory, press the Return key to continue.

    • If you do not want to accept the default spool directory, then answer y.

      In the following example the /my/spool directory is specified as the master host spool directory.


      Do you want to select another qmaster spool directory (y/n) [n] >> y
      
      Please enter a qmaster spool directory now! >>/my/spool
      
  10. The next question concerns Windows-based execution hosts.

    If you do not plan to use Windows support, answer No. If you want Windows support, answer Yes.

    If you answer yes, you will be asked some Windows-specific questions further on in the installation process. These questions will be marked as WINDOWS ONLY.


    Windows Execution Host Support
    ------------------------------
                                                                                    
    Are you going to install Windows Execution Hosts? (y/n) [n]
  11. Verify or set the correct file permissions.

    If you used pkgadd or you know that the file permissions are correct, you should answer Yes. Answering No will direct the script to set the permissions for you as shown in the next step.


    Verifying and setting file permissions
    --------------------------------------
    
    Did you install this version with >pkgadd< or did you already
    verify and set the file permissions of your distribution (y/n) [y] >> y
    
  12. Set the correct file permissions.

    • WINDOWS ONLY – If you specified that you wanted Windows Execution Host support in the previous question, you should let the script set the file permissions for you. Answer No to the following question.


      Verifying and setting file permissions
      --------------------------------------
      
      Did you install this version with >pkgadd< or did you already
      verify and set the file permissions of your distribution (y/n) [y] >> 
      
      In some cases, eg: the binaries are stored on a NTFS or on any other 
      filesystem, which provides additional file permissions, the UNIX file 
      permissions can be wrong. In this case we would advise to verify and 
      to set the file permissions (enter: n) (y/n) [n] >>n
      
    • Verify and set file permissions.


      Verifying and setting file permissions
      --------------------------------------
      
      We may now verify and set the file permissions of your Grid Engine
      distribution.
      
      This may be useful since due to unpacking and copying of your distribution
      your files may be unaccessible to other users.
      
      We will set the permissions of directories and binaries to
      
         755 - that means executable are accessible for the world
      
      and for ordinary files to
      
         644 - that means readable for the world
      
      Do you want to verify and set your file permissions (y/n) [y] >> y
      

      Verifying and setting file permissions and owner in >3rd_party<
      Verifying and setting file permissions and owner in >bin<
      Verifying and setting file permissions and owner in >ckpt<
      Verifying and setting file permissions and owner in >examples<
      Verifying and setting file permissions and owner in >install_execd<
      Verifying and setting file permissions and owner in >install_qmaster<
      Verifying and setting file permissions and owner in >mpi<
      Verifying and setting file permissions and owner in >pvm<
      Verifying and setting file permissions and owner in >qmon<
      Verifying and setting file permissions and owner in >util<
      Verifying and setting file permissions and owner in >utilbin<
      Verifying and setting file permissions and owner in >catman<
      Verifying and setting file permissions and owner in >doc<
      Verifying and setting file permissions and owner in >man<
      Verifying and setting file permissions and owner in >inst_sge<
      Verifying and setting file permissions and owner in >bin<
      Verifying and setting file permissions and owner in >lib<
      Verifying and setting file permissions and owner in >utilbin<
      
      Your file permissions were set
      
      Hit <RETURN> to continue >> 
  13. Specify whether all of your grid engine system hosts are located in a single DNS domain.


    Select default Grid Engine hostname resolving method
    ----------------------------------------------------
    
    Are all hosts of your cluster in one DNS domain? If this is
    the case the hostnames
    
       >hostA< and >hostA.foo.com<
    
    would be treated as equal, because the DNS domain name >foo.com<
    is ignored when comparing hostnames.
    
    Are all hosts of your cluster in a single DNS domain (y/n) [y] >>   
    • If all of your grid engine system hosts are located in a single DNS domain, then answer y.


      Are all hosts of your cluster in a single DNS domain (y/n) [y] >> y 
      
      Ignoring domainname when comparing hostnames.
      
      Hit <RETURN> to continue >> 
    • If not all of your grid engine system hosts are located in a single DNS domain, then answer n.


      Are all hosts of your cluster in a single DNS domain (y/n) [y] >> n 
      
      The domainname is not ignored when comparing hostnames.
      
      Hit <RETURN> to continue >> 

      Default domain for hostnames
      ----------------------------
      
      Sometimes the primary hostname of machines returns the short hostname
      without a domain suffix like >foo.com<.
      
      This can cause problems with getting load values of your execution hosts.
      If you are using DNS or you are using domains in your >/etc/hosts< file or
      your NIS configuration it is usually safe to define a default domain
      because it is only used if your execution hosts return the short hostname
      as their primary name.
      
      If your execution hosts reside in more than one domain, the default domain
      parameter must be set on all execution hosts individually.
      
      Do you want to configure a default domain (y/n) [y] >> 

      Press the Return key to continue.

      1. If you want to specify a default domain, then answer y.

        In the following example, sun.com is specified as the default domain.


        Do you want to configure a default domain (y/n) [y] >> y
        
        
        Please enter your default domain >> sun.com
        
        Using >sun.com< as default domain. Hit <RETURN> to continue >>
      2. If you do not want to specify a default domain, then answer n.

        In the following example, no default domain is configured.


        Do you want to configure a default domain (y/n) [y] >> n
        
  14. Press the Return key to continue.


    Making directories
    ------------------
    
    creating directory: default/common
    creating directory: /opt/n1ge6/default/spool/qmaster
    creating directory: /opt/n1ge6/default/spool/qmaster/job_scripts
    Hit <RETURN> to continue >> 
  15. Specify whether you want to use classic spooling or Berkeley DB.

    For more information on how to determine the type of spooling mechanism you want, please see Choosing Between Classic Spooling and Database Spooling.


    Setup spooling
    --------------
    Your SGE binaries are compiled to link the spooling libraries
    during runtime (dynamically). So you can choose between Berkeley DB 
    spooling and Classic spooling method.
    Please choose a spooling method (berkeleydb|classic) [berkeleydb] >> 
    • To specify Berkeley DB spooling, press the Return key to continue.


      Please choose a spooling method (berkeleydb|classic) [berkeleydb] >> 

      The Berkeley DB spooling method provides two configurations!
      
      1) Local spooling:
      The Berkeley DB spools into a local directory on this host (qmaster host)
      This setup is faster, but you can't setup a shadow master host
      
      2) Berkeley DB Spooling Server:
      If you want to setup a shadow master host, you need to use
      Berkeley DB Spooling Server!
      In this case you have to choose a host with a configured RPC service.
      The qmaster host connects via RPC to the Berkeley DB. This setup is more
      failsafe, but results in a clear potential security hole. RPC communication
      (as used by Berkeley DB) can be easily compromised. Please only use this
      alternative if your site is secure or if you are not concerned about
      security. Check the installation guide for further advice on how to achieve
      failsafety without compromising security.
      
      Do you want to use a Berkeley DB Spooling Server? (y/n) [n] >> 
      • To use a Berkeley DB spooling server, enter y.


        Do you want to use a Berkeley DB Spooling Server? (y/n) [n] >> y
        
        Berkeley DB Setup
        
        -----------------
        Please, log in to your Berkeley DB spooling host and execute "inst_sge -db"
        Please do not continue, before the Berkeley DB installation with
        "inst_sge -db" is completed, continue with <RETURN>
        

        Note –

        Do not press the Return key until you have completed the Berkeley DB installation on the spooling server.


        1. Start a new terminal session or window.

        2. Log in to the spooling server.

        3. Install the software, as described in How to Install the Berkeley DB Spooling Server.

        4. After you have installed the software on the spooling server, return to the master installation window, and press the Return key to continue.

        5. Type the name of the spooling server.

          In the following example, vector is the host name of the spooling server.


          Berkeley Database spooling parameters
          -------------------------------------
          
          Please enter the name of your Berkeley DB Spooling Server! >> vector
          
        6. Type the name of the spooling directory.

          In the following example, /opt/n1ge6/default/spooldb is the spooling directory.


          Please enter the Database Directory now!
          
          Default: [/opt/n1ge6/default/spooldb] >> 
          Dumping bootstrapping information
          Initializing spooling database
          
          Hit <RETURN> to continue >> 
      • If you do not want to use a Berkeley DB spooling server, type n.


        Do you want to use a Berkeley DB Spooling Server? (y/n) [n] >> n
        
        
        Hit <RETURN> to continue >> 

        Berkeley Database spooling parameters
        -------------------------------------
        
        Please enter the Database Directory now, even if you want to spool locally
        it is necessary to enter this Database Directory. 
        
        Default: [/opt/n1ge6/default/spool/spooldb] >> 

        Specify an alternate directory, or press the Return key to continue.


        creating directory: /opt/n1ge6/default/spool/spooldb
        Dumping bootstrapping information
        Initializing spooling database
        
        Hit <RETURN> to continue >> 
    • To specify classic spooling, type classic.


      Please choose a spooling method (berkeleydb|classic) [berkeleydb] >> classic
      

      Dumping bootstrapping information
      Initializing spooling database
      
      Hit <RETURN> to continue >> 
  16. Type a group ID range.

    For more information, see Group IDs.


    Grid Engine group id range
    --------------------------
    
    When jobs are started under the control of Grid Engine an additional group id
    is set on platforms which do not support jobs. This is done to provide maximum
    control for Grid Engine jobs.
    
    This additional UNIX group id range must be unused group id's in your system.
    Each job will be assigned a unique id during the time it is running.
    Therefore you need to provide a range of id's which will be assigned
    dynamically for jobs.
    
    The range must be big enough to provide enough numbers for the maximum number
    of Grid Engine jobs running at a single moment on a single host. E.g. a range
    like >20000-20100< means, that Grid Engine will use the group ids from
    20000-20100 and provides a range for 100 Grid Engine jobs at the same time
    on a single host.
    
    You can change at any time the group id range in your cluster configuration.
    
    Please enter a range >> 20000-20100
    
    Using >20000-20100< as gid range. Hit <RETURN> to continue >> 
  17. Verify the spooling directory for the execution daemon.

    For information on spooling, see Spool Directories Under the Root Directory.


    Grid Engine cluster configuration
    ---------------------------------
    
    Please give the basic configuration parameters of your Grid Engine
    installation:
    
       <execd_spool_dir>
    
    The pathname of the spool directory of the execution hosts. User >sgeadmin<
    must have the right to create this directory and to write into it.
    
    Default: [/opt/n1ge6/default/spool] >>  
  18. Type the email address of the user who should receive problem reports.

    In this example, the user who will receive problem reports is me@my.domain.


    Grid Engine cluster configuration (continued)
    ---------------------------------------------
    
    <administator_mail>
    
    The email address of the administrator to whom problem reports are sent.
    
    It's is recommended to configure this parameter. You may use >none<
    if you do not wish to receive administrator mail.
    
    Please enter an email address in the form >user@foo.com<.
    
    Default: [none] >> me@my.domain
    
  19. Verify the configuration parameters.


    The following parameters for the cluster configuration were configured:
    
       execd_spool_dir        /opt/n1ge6/default/spool
       administrator_mail     me@my.domain
    
    Do you want to change the configuration parameters (y/n) [n] >> n
    
    Creating local configuration
    ----------------------------
    Creating >act_qmaster< file
    Adding default complex attributes
    Reading in complex attributes.
    Adding default parallel environments (PE)
    Reading in parallel environments:
            PE "make".
    Adding SGE default usersets
    Reading in usersets:
            Userset "deadlineusers".
            Userset "defaultdepartment".
    Adding >sge_aliases< path aliases file
    Adding >qtask< qtcsh sample default request file
    Adding >sge_request< default submit options file
    Creating >sgemaster< script
    Creating >sgeexecd< script
    Creating settings files for >.profile/.cshrc<
    
    Hit <RETURN> to continue >> 
  20. WINDOWS ONLY – If you specified that you want Windows support, you are asked to create Certificate Security Protocol (CSP) certificates.

    Read How to Install a CSP-Secured System for information about CSP certificates before you continue.

  21. Specify whether you want the daemons to start when the system is booted.


    qmaster/scheduler startup script
    --------------------------------
    
    We can install the startup script that will
    start qmaster/scheduler at machine boot (y/n) [y] >> y
    
    Installing startup script /etc/rc2.d/S95sgemaster
    
    Hit <RETURN> to continue >> 
    ...
  22. WINDOWS ONLY – Add the Windows Administrator name to the SGE manager list.


    Windows Administrator Name
    --------------------------
                                                                                    
    For a later execution host installation it is recommended to add the
    Windows Administrator name to the SGE manager list
                                                                                    
    
    Please, enter the Windows Administrator name [Default: Administrator] >>
  23. Identify the hosts that you will later install as execution hosts.


    Adding Grid Engine hosts
    ------------------------
    
    Please now add the list of hosts, where you will later install your execution
    daemons. These hosts will be also added as valid submit hosts.
    
    Please enter a blank separated list of your execution hosts. You may
    press <RETURN> if the line is getting too long. Once you are finished
    simply press <RETURN> without entering a name.
    
    You also may prepare a file with the hostnames of the machines where you plan
    to install Grid Engine. This may be convenient if you are installing Grid
    Engine on many hosts.
    
    Do you want to use a file which contains the list of hosts (y/n) [n] >> n
    
    Adding admin and submit hosts
    -----------------------------
    
    Please enter a blank seperated list of hosts.
    
    Stop by entering <RETURN>. You may repeat this step until you are
    entering an empty list. You will see messages from Grid Engine
    when the hosts are added.
    
    Host(s): host1 host2 host3 host4
    
    host1 added to administrative host list
    host1 added to submit host list
    host2 added to administrative host list
    host2 added to submit host list
    host3 added to administrative host list
    host3 added to submit host list
    host4 added to administrative host list
    host4 added to submit host list
    Hit <RETURN> to continue >> 
    
    Creating the default <all.q> queue and <allhosts> hostgroup
    -----------------------------------------------------------
    
    root@vector added "@allhosts" to host group list
    root@vector added "all.q" to cluster queue list
    
    Hit <RETURN> to continue >> 
  24. Select a scheduler profile.

    For information on how to determine which profile you should use, see Scheduler Profiles.


    Scheduler Tuning
    ----------------
    
    The details on the different options are described in the manual. 
    
    Configurations
    --------------
    1) Normal
              Fixed interval scheduling, report scheduling information,
              actual + assumed load
    
    2) High
              Fixed interval scheduling, report limited scheduling information,
              actual load
    
    3) Max
              Scheduling on demand, report no scheduling information,
              actual load
    
    Enter the number of your preferred configuration and hit <RETURN>! 
    Default configuration is [1] >> 

    Once you answer this question, the installation process is complete. Several screens of information will be displayed before the script exits. The commands that are noted in those screens are also documented in this chapter.

  25. WINDOWS ONLY – If you are using CSP mode, copy the certificate files to each execution host.

    You can use a script to perform this function.


    Tip –

    For the script to copy the files without asking for a password, the root user must be able to access the execution hosts with rsh or ssh without entering a password.



    Should the script try to copy the cert files, for you, to each
    execution host? (y/n) [y] >>
  26. Create the environment variables for use with the grid engine software.


    Note –

    If no cell name was specified during installation, the value of cell is default.


    • If you are using a C shell, type the following command:


      % source sge-root/cell/common/settings.csh
      
    • If you are using a Bourne shell or Korn shell, type the following command:


      $ . sge-root/cell/common/settings.sh
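
The settings file path in the last step is built from the root directory and the cell name, with default standing in when no cell name was chosen. A minimal sketch of that path construction, assuming a Bourne-compatible shell, is shown here. The function name settings_file is hypothetical.

```shell
# Hypothetical helper: print the path of the Bourne/Korn shell settings
# file for a given root directory and cell name. An empty or missing
# cell name falls back to "default", as described in the note above.
settings_file() {
    printf '%s/%s/common/settings.sh\n' "$1" "${2:-default}"
}

# Example use (commented out because it needs a real installation):
# . "$(settings_file "$SGE_ROOT" "$SGE_CELL")"
```

C shell users would source settings.csh from the same directory instead.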
      
See Also

For details about how you can verify that the master host has been set up correctly, see How to Verify That the Daemons Are Running on the Master Host.
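
The group ID range step requires that the chosen range consist of group IDs that are not used on your system. One way to check a candidate range against a group file is sketched below. It is an illustration under stated assumptions: the function name gid_range_conflicts is hypothetical, it reads only a file in the usual name:password:gid:members format, it does not see groups that exist only in NIS or another name service, and it assumes an awk that accepts the -v option.

```shell
# Hypothetical helper: list groups in FILE whose GID lies inside the
# inclusive range LO..HI. Succeeds only if no group falls in the range.
gid_range_conflicts() {
    awk -F: -v lo="$2" -v hi="$3" '
        $3 + 0 >= lo + 0 && $3 + 0 <= hi + 0 {
            print $1 " (gid " $3 ")"
            n++
        }
        END { exit n ? 1 : 0 }
    ' "$1"
}

# Example use (commented out because it reads the live system file):
# gid_range_conflicts /etc/group 20000 20100 || echo "choose another range"
```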

Procedure: How to Install Execution Hosts

The execution host installation procedure creates the appropriate directory hierarchy required by sge_execd, and starts the sge_execd daemon on the execution host. This section describes how to install execution hosts interactively from the command line. You can automate the installation of multiple execution hosts by using the procedure described in Chapter 3, Automating the Installation Process.

Before You Begin

Before installing an execution host, you must first install the master host, as described in How to Install the Master Host, and share the common directory.
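
A quick way to confirm that the shared common directory is visible from the execution host is sketched below. This is a minimal sketch, not a documented check: the function name common_dir_ready is hypothetical, and it assumes that the act_qmaster file created during the master installation resides in sge-root/cell/common.

```shell
# Hypothetical helper: succeed if the cell's common directory is visible
# from this host and holds a non-empty act_qmaster file.
# ASSUMPTION: act_qmaster is located in <root>/<cell>/common.
common_dir_ready() {
    common="$1/${2:-default}/common"
    [ -d "$common" ] && [ -s "$common/act_qmaster" ]
}

# Example use (commented out because it needs a real installation):
# common_dir_ready "$SGE_ROOT" "$SGE_CELL" || echo "common directory not shared yet"
```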

You must satisfy several prerequisites before you can install N1 Grid Engine execution hosts with Windows operating systems. You might have to install additional software on your computer. See Appendix A, Microsoft Services For UNIX.

On Microsoft Windows machines, additional steps are necessary before you can continue with the execution host installation. Follow Steps 6a, 6b, and 6c in How to Install a CSP-Secured System.


Note –

If you are using Microsoft Windows machines, after the installation, each user has to register their Windows password with N1 Grid Engine using the sgepasswd client application. (See Appendix B for more information.)
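
The procedure below uses qconf -sh to check whether the execution host is already an administration host. That check can be scripted as in the following sketch, which assumes that qconf -sh prints one host name per line and that a POSIX grep is available. The function name is_listed_host is hypothetical. The match is exact, so a short host name does not match a fully qualified one.

```shell
# Hypothetical helper: read a host list on standard input, one name per
# line, and succeed if HOST appears in it as a whole line.
is_listed_host() {
    grep -Fqx "$1"
}

# Example use (commented out because it needs a running qmaster):
# qconf -sh | is_listed_host "$(hostname)" ||
#     echo "declare this host with qconf -ah on the master host"
```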


  1. Log in to the execution host as root.

  2. As you did for the master installation, either copy the installation files to a local installation directory sge-root or use a network installation directory.

  3. If the $SGE_ROOT environment variable is not set, set it by typing:


    # SGE_ROOT=sge-root; export SGE_ROOT
    

    To confirm that you have set the $SGE_ROOT environment variable, type:


    # echo $SGE_ROOT
    
  4. Change directory (cd) to the installation directory, sge-root.

  5. Verify that the execution host has been declared as an administration host.


    # qconf -sh
    
    • If you do not see the name of this execution host in the output of the qconf command, you will need to declare it as an administration host.

      1. Start a new terminal session or window.

      2. In that window, log into the master host.

      3. Declare the execution host as an administration host, using the qconf command.


        # qconf -ah quark
        quark added to administrative host list
         
      4. Log back out of the master host, and continue with the installation of the execution host.

  6. Run the install_execd command.

    If you are installing using the Certificate Security Protocol method described in Chapter 4, Installing the Increased Security Features, add the -csp option to the install_execd command.


    # ./install_execd
    

    This command starts the execution host installation procedure.


    Welcome to the Grid Engine execution host installation
    ------------------------------------------------------
    
    If you haven't installed the Grid Engine qmaster host yet, you must execute
    this step (with >install_qmaster<) prior the execution host installation.
    
    For a sucessful installation you need a running Grid Engine qmaster. It is
    also neccessary that this host is an administrative host.
    
    You can verify your current list of administrative hosts with
    the command:
    
       # qconf -sh
    
    You can add an administrative host with the command:
    
       # qconf -ah <hostname>
    
    The execution host installation will take approximately 5 minutes.
    
    Hit <RETURN> to continue >> 
  7. Verify the sge-root directory setting.

    In the following example, the value of sge-root is /opt/n1ge6.


    Checking $SGE_ROOT directory
    ----------------------------
    
    The Grid Engine root directory is:
    
       $SGE_ROOT = /opt/n1ge6
    
    If this directory is not correct (e.g. it may contain an automounter
    prefix) enter the correct path to this directory or hit <RETURN>
    to use default [/opt/n1ge6] >> 
    
    Your $SGE_ROOT directory: /opt/n1ge6
    
    Hit <RETURN> to continue >> 
  8. Type the name of your cell.

    The use of grid engine system cells is described in Cells.


    Grid Engine cells
    -----------------
    
    Grid Engine supports multiple cells.
    
    If you are not planning to run multiple Grid Engine clusters or if you don't
    know yet what is a Grid Engine cell it is safe to keep the default cell name
    
       default
    
    If you want to install multiple cells you can enter a cell name now.
    
    The environment variable
    
       $SGE_CELL=<your_cell_name>
    
    will be set for all further Grid Engine commands.
    
    Enter cell name [default] >> 
    • If you have decided to use cells, then type the cell name now.

    • If you have decided not to use cells, then press the Return key to continue.


      Using cell >default<. 
      Hit <RETURN> to continue >> 

    Press <RETURN> to continue.

  9. The install script checks to see whether the admin user already exists.

    If the admin user already exists, the script continues uninterrupted. If the admin user does not exist, the script shows the following screen where you must supply a password for the admin user. After the admin user is created, press the Return key to continue with the installation.


    Local Admin User
    ----------------
    
    The local admin user sgeadmin, does not exist!
    The script tries to create the admin user.
    Please enter a password for your admin user >>

    Creating admin user sgeadmin, now ...
    
    Admin user created, hit <ENTER> to continue!
  10. Press the Return key to continue.

    The script verifies that the execution host has been declared as an administration host.


    Checking hostname resolving
    ---------------------------
    
    This hostname is known at qmaster as an administrative host.
    
    Hit <RETURN> to continue >> 
  11. Specify whether you want to use a local spool directory.

    For information on spooling, see Spool Directories Under the Root Directory.


    Local execd spool directory configuration
    -----------------------------------------
    
    During the qmaster installation you've already entered a global
    execd spool directory. This is used, if no local spool directory is configured.
    
    Now you can configure a local spool directory for this host.
    ATTENTION: The local spool directory doesn't have to be located on a local
    drive. It is specific to the <local> host and can be located on network drives,
    too. But for performance reasons, spooling to a local drive is recommended.
    
    FOR WINDOWS USER: On Windows systems the local spool directory MUST be set
    to a local harddisk directory.
    Installing an execd without local spool directory makes the host unuseable.
    Local spooling on local harddisk is mandatory for Windows systems.
    
    Do you want to configure a local spool directory
    for this host (y/n) [n] >>
    • If you do not want a local spool directory, answer n.


      Do you want to configure a local spool directory
      for this host (y/n) [n] >> n
      

      Creating local configuration
      ----------------------------
      sgeadmin@host1 modified "host1" in configuration list
      
      Local configuration for host >host1< created.
      
      Hit <RETURN> to continue >> 
    • If you do want a local spool directory, answer y.

      In the following example, /var/tmp/spool is used as the local spool directory on host1. Choose any directory that meets the disk space requirements described in Disk Space Requirements.


      Do you want to configure a local spool directory
      for this host (y/n) [n] >> y
      
      Please enter the local spool directory now! >> /var/tmp/spool
      Using local execd spool directory [/var/tmp/spool]
      Hit <RETURN> to continue >> 

      Creating local configuration
      ----------------------------
      sgeadmin@host1 modified "host1" in configuration list
      
      Local configuration for host >host1< created.
      
      Hit <RETURN> to continue >> 
  12. Specify whether you want execd to start automatically at boot time.

    You might not want to install the startup script if you are installing a test cluster or you would rather start the daemon manually on reboot.


    execd startup script
    --------------------
    
    We can install the startup script that will
    start execd at machine boot (y/n) [y] >> y
    
    Installing startup script /etc/rc2.d/S95sgeexecd
    
    Hit <RETURN> to continue >> 
    1. WINDOWS ONLY – Choose whether to display the GUI for Windows jobs.

      An N1 Grid Engine Helper Service is included with the N1 Grid Engine 6.1 distribution. This service enables Windows jobs to display a GUI on the visible desktop of the execution host. The visible desktop is either the desktop of the user currently logged in on the execution host or the desktop of the next user who will log in. It is not the login screen.

      The Helper Service is an independent component that is loosely coupled with the execution daemon. The Helper Service is started from the Services dialog box in the Windows control panel. You can install only one Helper Service per host, and only one execution daemon per Helper Service.

      During the installation of an execution host, the installation script asks whether you want to see the GUI of Windows jobs.


      SGE Windows Helper Service Installation
      ---------------------------------------
      
      If you're going to run Windows job's using GUI support, you have
      to install the Windows Helper Service
      Do you want to install the Windows Helper Service? (y/n) [n] >>
    2. Start the execution daemon.


      Grid Engine execution daemon startup
      ------------------------------------
      
      Starting execution daemon. Please wait ...
         starting sge_execd
      
      Hit <RETURN> to continue >> 
  13. Specify a queue for this host.


    Adding a queue for this host
    ----------------------------
    
    We can now add a queue instance for this host:
    
       - it is added to the >allhosts< hostgroup
       - the queue provides 1 slot(s) for jobs in all queues
         referencing the >allhosts< hostgroup
    
    You do not need to add this host now, but before running jobs on this host
    it must be added to at least one queue.
    
    Do you want to add a default queue instance for this host (y/n) [y] >> 

    Once you answer this question, the installation process is complete. Several screens of information will be displayed before the script exits. The commands that are noted in those screens are also documented in this chapter.

  14. Create the environment variables for use with the grid engine software.


    Note –

    If no cell name was specified during installation, the value of cell is default.


    • If you are using a C shell, type the following command:


      % source sge-root/cell/common/settings.csh
      
    • If you are using a Bourne shell or Korn shell, type the following command:


      $ . sge-root/cell/common/settings.sh
      
See Also

For details about how you can verify that the execution host has been set up correctly, see How to Verify That the Daemons Are Running on the Execution Hosts.
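
The checks in steps 3 and 5 of this procedure (a set $SGE_ROOT variable and an entry in the administration host list) can be sketched as a small Bourne shell helper. This is an illustrative sketch only, not part of the distribution. The function names host_in_list and preflight are hypothetical, and the sketch assumes that qconf -sh prints one host name per line, spelled exactly as the name you pass in.

```shell
#!/bin/sh
# Hypothetical helper functions; they are not shipped with the product.

# host_in_list NAME: reads host names (one per line) on standard input
# and succeeds if NAME matches one of them exactly.
host_in_list() {
    while IFS= read -r line; do
        [ "$line" = "$1" ] && return 0
    done
    return 1
}

# preflight NAME: checks that SGE_ROOT is set and that NAME appears in
# the administration host list printed by "qconf -sh".
preflight() {
    if [ -z "${SGE_ROOT:-}" ]; then
        echo "SGE_ROOT is not set"
        return 1
    fi
    if qconf -sh | host_in_list "$1"; then
        echo "ok"
    else
        echo "$1 is not an administration host; run 'qconf -ah $1' on the master host"
        return 1
    fi
}
```

For example, you might run preflight quark on the execution host before you start install_execd. If the host list uses fully qualified names, pass the fully qualified name.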

Registering Administration Hosts

The master host is implicitly allowed to run administrative tasks and to submit, monitor, and delete jobs. The master host does not require any additional installation to act as an administration host. By contrast, pure administration hosts do require registration.


Note –

You can also register administration hosts by using the QMON graphical user interface. For information about how to complete this task using QMON, see Configuring Administration Hosts With QMON in Sun N1 Grid Engine 6.1 Administration Guide.


From the master host, using the grid engine system administrative account, for example, the sgeadmin account, type the following command:


% qconf -ah admin-host-name[,...]
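
Because the command accepts a comma-separated list, several hosts can be registered in one call. The following sketch builds such a list from separate arguments. The join_hosts function and the host names admin1, admin2, and admin3 are hypothetical. The sketch only prints the resulting command so that you can review it before you run it.

```shell
#!/bin/sh
# join_hosts NAME...: prints the given names joined with commas, which is
# the list form shown in the qconf -ah syntax above.
join_hosts() {
    list=""
    for h in "$@"; do
        if [ -z "$list" ]; then
            list="$h"
        else
            list="$list,$h"
        fi
    done
    echo "$list"
}

# Hypothetical host names. The command is printed, not executed.
echo "qconf -ah $(join_hosts admin1 admin2 admin3)"
```

The same list form applies to the qconf -as command for submit hosts.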

Registering Submit Hosts


Note –

You can also register submit hosts by using the QMON graphical user interface. For information about how to complete this task using QMON, see Configuring Submit Hosts With QMON in Sun N1 Grid Engine 6.1 Administration Guide.


From the master host, using the grid engine system administrative account, for example, the sgeadmin account, type the following command:


% qconf -as submit-host-name[,...]

Refer to About Hosts and Daemons in Sun N1 Grid Engine 6.1 Administration Guide for more details and other means to configure the different host types.
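
Before you register submit hosts, you might want to find out which hosts are not yet registered. The following sketch compares the names you want against the current submit host list. The missing_hosts function is hypothetical, and the sketch assumes that qconf -ss prints one host name per line, spelled exactly as the names you pass in.

```shell
#!/bin/sh
# missing_hosts NAME...: reads the current submit host list (one name per
# line) on standard input and prints each NAME that is not on the list.
missing_hosts() {
    current=$(cat)
    for h in "$@"; do
        found=no
        for c in $current; do
            [ "$c" = "$h" ] && found=yes
        done
        [ "$found" = no ] && echo "$h"
    done
    return 0
}

# Possible use on the master host (hypothetical host names):
#   qconf -ss | missing_hosts submit1 submit2
```

Each name that the function prints can then be passed to qconf -as.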

How to Install the Berkeley DB Spooling Server

The installation procedure installs the grid engine software necessary for Berkeley DB spooling.

Before You Begin

The grid engine software must be loaded onto a local file system. For details on how to extract the files, see How to Load the Distribution Files On a Workstation.

  1. Log in to the spooling server host as root.

  2. If the $SGE_ROOT environment variable is not set, set it by typing:


    # SGE_ROOT=sge-root; export SGE_ROOT
    

    To confirm that you have set the $SGE_ROOT environment variable, type:


    # echo $SGE_ROOT
    
  3. Change to the installation directory.


    # cd $SGE_ROOT
    
  4. Type the inst_sge command with the -db option.


    # sge-root/inst_sge -db
    

    This command starts the spooling server installation procedure. You are asked several questions. If you think something went wrong, you can quit the installation procedure and restart it at any time.

  5. Choose an administrative account owner.


    Choosing Grid Engine admin user account
    ---------------------------------------
    
    You may install Grid Engine that all files are created with the user id of an
    unprivileged user.
    
    This will make it possible to install and run Grid Engine in directories
    where user >root< has no permissions to create and write files and directories.
    
       - Grid Engine still has to be started by user >root<
    
       - this directory should be owned by the Grid Engine administrator
    
    Do you want to install Grid Engine
    under an user id other than >root< (y/n) [y] >> y
    

    Choosing a Grid Engine admin user name
    --------------------------------------
    
    Please enter a valid user name >> sgeadmin
    Installing Grid Engine as admin user >sgeadmin<
    
    Hit <RETURN> to continue >>
  6. Verify the sge-root directory setting.

    In the following example, the value of sge-root is /opt/n1ge6.


    Checking $SGE_ROOT directory
    ----------------------------
    
    The Grid Engine root directory is:
    
       $SGE_ROOT = /opt/n1ge6
    
    If this directory is not correct (e.g. it may contain an automounter
    prefix) enter the correct path to this directory or hit <RETURN>
    to use default [/opt/n1ge6] >> 
    
    Your $SGE_ROOT directory: /opt/n1ge6
    
    Hit <RETURN> to continue >> 
  7. Type the name of your cell.

    The use of grid engine system cells is described in Cells.


    Grid Engine cells
    -----------------
    
    Grid Engine supports multiple cells.
    
    If you are not planning to run multiple Grid Engine clusters or if you don't
    know yet what is a Grid Engine cell it is safe to keep the default cell name
    
       default
    
    If you want to install multiple cells you can enter a cell name now.
    
    The environment variable
    
       $SGE_CELL=<your_cell_name>
    
    will be set for all further Grid Engine commands.
    
    Enter cell name [default] >> 
  8. Select Berkeley DB spooling.


    Setup spooling
    --------------
    Your SGE binaries are compiled to link the spooling libraries
    during runtime (dynamically). So you can choose between Berkeley DB 
    spooling and Classic spooling method.
    Please choose a spooling method (berkeleydb|classic) [berkeleydb] >> 
  9. Verify your host name.

    In this example, the installation script is being run on host2.


    Berkeley Database spooling parameters
    -------------------------------------
    
    You are going to install an RPC Client/Server mechanism!
    In this case, qmaster will
    contact an RPC server running on a separate server machine.
    If you want to use the SGE shadowd, you have to use the 
    RPC Client/Server mechanism.
    
    
    Enter database server name or 
    hit <RETURN> to use default [host2] >> 
  10. Type the directory path of your spooling directory.

    You might need to change this path if this directory is NFS mounted, or if you do not have write permissions to this directory.


    Enter the database directory
    or hit <RETURN> to use default [/opt/n1ge6/default//spooldb] >> 
    
    creating directory: /opt/n1ge6/default//spooldb
  11. Start the RPC server.


    Now we have to startup the rc script
    >/opt/n1ge6/default/common/sgebdb< 
    on the RPC server machine
    
    If you already have a configured Berkeley DB Spooling Server,
    you have to restart the Database with the rc script now and continue with >NO<
    
    Shall the installation script try to start the RPC server? (y/n) [y] >> y
    Starting rpc server on host host2!
    The Berkeley DB has been started with these parameters:
    
    Spooling Server Name: host2
    DB Spooling Directory: /opt/n1ge6/default//spooldb
    
    Please remember these values, during Qmaster installation
    you will be asked for them! Hit <RETURN> to continue!
  12. Specify whether you want the Berkeley DB service to start automatically at boot time.


    Berkeley DB startup script
    --------------------------
    
    We can install the startup script that
    Grid Engine is started at machine boot (y/n) [y] >> y
    

    Once you answer this question, the installation process is complete.

  13. Create the environment variables for use with the grid engine software.


    Note –

    If no cell name was specified during installation, the value of cell is default.


    • If you are using a C shell, type the following command:


      % source sge-root/cell/common/settings.csh
      
    • If you are using a Bourne shell or Korn shell, type the following command:


      $ . sge-root/cell/common/settings.sh
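
The path of the settings file depends on the root directory, the cell name, and the type of shell. The following sketch shows how the path is composed, with the cell name falling back to default when none is given. The settings_file function is a hypothetical helper and is not part of the distribution.

```shell
#!/bin/sh
# settings_file ROOT [CELL] [SHELL_PATH]: prints the path of the settings
# file for the given root directory and cell. An empty or missing CELL
# falls back to "default". A SHELL_PATH that ends in "csh" (csh, tcsh)
# selects settings.csh; anything else selects settings.sh.
settings_file() {
    root=$1
    cell=${2:-default}
    case "${3:-sh}" in
        *csh) echo "$root/$cell/common/settings.csh" ;;
        *)    echo "$root/$cell/common/settings.sh" ;;
    esac
}

# Possible use in a Bourne or Korn shell:
#   . "$(settings_file "$SGE_ROOT" "$SGE_CELL")"
```

With a root directory of /opt/n1ge6 and no cell name, the function prints /opt/n1ge6/default/common/settings.sh.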