Sun N1 System Manager 1.1 Administration Guide

Managing Jobs

This section describes jobs and how they are an integral part of server monitoring.

Each major action you take in the N1 System Manager starts a job. Use the job log to track the status of a currently running action or to verify that a job has finished. Monitoring jobs is particularly useful because some N1 System Manager actions can take a long time to finish. An example of such an action is installing an OS distribution on one or more provisionable servers.

You can track jobs through the Jobs tab in the browser interface or with the show job command. The show job command provides information about most of the following characteristics:

Job ID

Generated unique identifier.

Date

Date on which the job was started.

Job Type

Type of job. See show job in Sun N1 System Manager 1.1 Command Line Reference Manual for details. When using the show job command with the type parameter, jobs can be any of the following types:

  • addbase: Add base management support.

  • addbasemonitor: Add OS monitoring support.

  • createos: Create OS distribution from CD/DVD media or ISO files.

  • deletejob: Delete job.

  • discover: Server discovery.

  • loadfirmware: Load firmware update.

  • loados: Load OS.

  • loadupdate: Load OS update.

  • refresh: Server refresh.

  • removeosmonitor: Remove OS monitoring support.

  • setagentip: Modify OS monitoring support.

  • start: Server power on.

  • stop: Server power off.

  • unloadupdate: Unload OS update.

State

State of the current job step. Job steps indicate the progress of a job and update results. Each job step has a type, a start time and, when the job completes, a completion time. For the purposes of filtering, job progress is indicated with the following states:

notstarted

The job has not yet started. Jobs in a notstarted state cannot be stopped.

preflight

When you select a job by ID and view the details of that job, each step of that job appears twice – the preflight check and the execution of the step itself.

running

The job is currently running. Jobs that are currently running cannot be deleted using the delete job command. Jobs that are currently running must finish running or be stopped using the stop job command.

Job completion is indicated with the following results:

completed

Indicates that the job step completed successfully.

warning

Indicates that a warning was issued during the job execution. A warning reports an issue that might or might not be severe enough to terminate the job step, and the job, with errors.

abort

Indicates that the job step stopped before it completed.

abort_pending

Indicates that the job is still running but that the job step cannot complete successfully.

error

Indicates a general error in that job step.

timed_out

Indicates that the job timed out before all of the job steps could complete successfully, or that the next step of the job started before the current step completed successfully.

The overall job status in the output is Complete - Warning if the job successfully completed all of its steps but one or more steps reported a WARNING state that was not severe enough to terminate the job with errors.

You can filter jobs depending on their state. See show job in Sun N1 System Manager 1.1 Command Line Reference Manual for details.

Owner

The user who started the job. Also called the job creator.

Job Results

Provides details about the results of a completed job. You can review the standard output of remote command operations and completion statuses for all other job types.
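
The state and result names listed under State above can be gathered into a small lookup, which is handy when a script post-processes job information. The following Python sketch is illustrative only: the grouping into two sets and the helper names classify and delete_blocked are assumptions made for this example, not part of the N1 System Manager. It encodes only what this section states.

```python
# Names copied from the State descriptions in this section.
# The two-set grouping and the helper functions are hypothetical.
PROGRESS_STATES = {"notstarted", "preflight", "running"}
RESULT_STATES = {
    "completed", "warning", "abort", "abort_pending", "error", "timed_out",
}


def classify(state: str) -> str:
    """Return 'progress' or 'result' for a documented state name."""
    if state in PROGRESS_STATES:
        return "progress"
    if state in RESULT_STATES:
        return "result"
    raise ValueError(f"state not listed in this section: {state!r}")


def delete_blocked(state: str) -> bool:
    """True for the one case this section documents as blocking deletion:
    a job that is currently running."""
    classify(state)  # reject names that this section does not list
    return state == "running"
```

For example, classify("timed_out") returns "result", and delete_blocked("running") returns True.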

To List Jobs

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. View the list of jobs.


    N1-ok> show job all
    

    A list of all jobs for the N1 System Manager is returned.

    See show job in Sun N1 System Manager 1.1 Command Line Reference Manual for details.


Example 5–10 Listing All Jobs

This example shows that using the show job command with the all option returns a list of jobs by Job ID, together with the date and time at which the job was started. The job type and status are also returned, along with the identity of the user who created the job.


N1-ok> show job all
Job ID          Date                       Type                  Status        Creator
7               2005-09-16T10:51:07-0700   Discovery             Completed      root
6               2005-09-14T14:42:52-0700   Server Reboot         Error          root
5               2005-09-14T14:38:25-0700   Server Power On       Completed      root
4               2005-09-14T14:29:20-0700   Server Power Off      Completed      root
3               2005-09-09T13:01:35-0700   Discovery             Completed      root
2               2005-09-09T12:38:16-0700   Discovery             Completed      root
1               2005-09-09T10:32:40-0700   Discovery             Completed      root
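
If you capture this listing as text, for example from a saved terminal session, a short script can turn it into records. The following Python sketch assumes the column layout shown in Example 5–10, with columns separated by at least two spaces. The JobRow type and the parse_job_list function are hypothetical names used for illustration.

```python
import re
from typing import NamedTuple


class JobRow(NamedTuple):
    job_id: int
    date: str
    job_type: str
    status: str
    creator: str


def parse_job_list(text: str) -> list[JobRow]:
    """Parse captured 'show job all' output into JobRow records."""
    rows = []
    for line in text.splitlines():
        # Columns are separated by runs of two or more whitespace
        # characters (or by tabs), so a multi-word type such as
        # "Server Power On" stays in a single field.
        fields = re.split(r"\s{2,}|\t+", line.strip())
        # Data rows have five fields and a numeric Job ID; this skips
        # the prompt line, the header row, and blank lines.
        if len(fields) == 5 and fields[0].isdigit():
            rows.append(JobRow(int(fields[0]), *fields[1:]))
    return rows
```

Applied to the listing in Example 5–10, the function returns seven records, and a script could then select, for instance, every record whose status is Error.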

To View a Specific Job

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. View a specific job.


    N1-ok> show job job
    

    In this command, the second job is a placeholder for the Job ID of the job that you want to view. Detailed information about that job appears in the output.

    See show job in Sun N1 System Manager 1.1 Command Line Reference Manual for details.


Example 5–11 Viewing Job Details

This example shows that using the show job command with the Job ID returns the date and time at which the job was started, the job type and status, and the identity of the user who created the job. Further details are provided for each step of that job, including the time at which the step started and completed and whether the step was successful.


N1-ok> show job 5
Job ID:      5
Date:        2005-02-14T14:38:25-0700
Type:        Server Power On
Status:      Completed
Creator:     root
Errors:      0
Warnings:    0
Step 1:      
Type:        103
Description: native procedure /bin/sh /opt/sun/n1gc/bin/serverPowerOn.sh :[SERVER_NAME] :[JOBID_KEY]
Start:       2005-02-14T14:38:25-0700
Completion:  2005-02-14T14:38:25-0700
Result:      Complete
Exception:   No Data Available
Step 2:      
Type:        103
Description: native procedure /bin/sh /opt/sun/n1gc/bin/serverPowerOn.sh :[SERVER_NAME] :[JOBID_KEY]
Start:       2005-02-14T14:38:28-0700
Completion:  2005-02-14T14:38:35-0700
Result:      Complete
Exception:   No Data Available
Step 3:      
Type:        135
Description: connect and lock hosts
Start:       2005-02-14T14:38:25-0700
Completion:  2005-02-14T14:38:25-0700
Result:      Complete
Exception:   No Data Available
Step 4:      
Type:        135
Description: connect and lock hosts
Start:       2005-02-14T14:38:27-0700
Completion:  2005-02-14T14:38:28-0700
Result:      Complete
Exception:   No Data Available
Result 1:    
Server:      192.168.200.3
Status:      0
Message:     The server operation was successful.
N1-ok> 

Each step appears twice in the output. The first appearance of the step in the list is the preflight check, and the second appearance of the step in the list is the actual execution of the step.
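
The pairing described above can be sketched in Python for captured show job output. This is a minimal sketch under two assumptions that the source does not confirm: that each field fits on one line, and that two entries with the same Type and Description are the preflight check and the execution of the same step. The function names parse_steps and label_phases are hypothetical.

```python
from collections import Counter


def parse_steps(text: str) -> list[dict]:
    """Collect the 'Key: value' lines that follow each 'Step N:' heading."""
    steps = []
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not key:
            continue  # no field name on this line (prompt, wrapped text)
        if key.startswith("Step "):
            current = {"Step": int(key.split()[1])}
            steps.append(current)
        elif key.split()[0] == "Result" and not value:
            current = None  # "Result 1:" heading: per-server results begin
        elif current is not None:
            current[key] = value
    return steps


def label_phases(steps: list[dict]) -> list[tuple[int, str]]:
    """Label the first entry of each (Type, Description) pair as the
    preflight check and any later entry as the execution."""
    seen = Counter()
    labeled = []
    for step in steps:
        signature = (step.get("Type"), step.get("Description"))
        seen[signature] += 1
        phase = "preflight" if seen[signature] == 1 else "execution"
        labeled.append((step["Step"], phase))
    return labeled
```

For the output in Example 5–11, this labels Steps 1 and 3 as preflight checks and Steps 2 and 4 as executions.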


To Stop a Job

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Stop a specific job.


    N1-ok> stop job job
    

    In this command, the second job is a placeholder for the Job ID of the job that you want to stop. The job is stopped.

    See stop job in Sun N1 System Manager 1.1 Command Line Reference Manual for details.

  3. View the job details.


    N1-ok> show job job
    

    The Result section of the output shows that the job was stopped.

    In principle, any job can be stopped. In practice, however, a job can be stopped only if it is not in its last step, so a job that has only one step can never be stopped. Jobs in a notstarted state also cannot be stopped. Operations that are performed on large groups of servers can take longer and might include a large number of steps.

    See show job in Sun N1 System Manager 1.1 Command Line Reference Manual for details.
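
The conditions under which a job can be stopped, as described in Step 3, can be written as a small predicate. This Python sketch is an illustration of those rules only. The function name is_stoppable and its parameters are hypothetical, and the sketch assumes that the caller already knows which step the job is in and how many steps it has.

```python
def is_stoppable(state: str, current_step: int, total_steps: int) -> bool:
    """Apply the stop rules stated in this procedure.

    state        -- job state name, for example "running" or "notstarted"
    current_step -- 1-based number of the step the job is in
    total_steps  -- number of steps in the job
    """
    if total_steps < 1 or not 1 <= current_step <= total_steps:
        raise ValueError("step numbers out of range")
    # Jobs in a notstarted state cannot be stopped.
    if state == "notstarted":
        return False
    # Only a job that is not in its last step can be stopped, so a
    # job with a single step can never be stopped.
    return current_step < total_steps
```

For example, is_stoppable("running", 1, 1) returns False because a single-step job is always in its last step, whereas is_stoppable("running", 2, 4) returns True.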


Example 5–12 Stopping a Remote Command Job

This example shows that using the stop job command with the Job ID returns a message confirming that the request has been received.


N1-ok> stop job 9

Stop Job "9" request received.

This example also shows that the show job command can be used with the Job ID of the stopped job to view more details. The output confirms, in Status, that the job was stopped, and shows, in Type, that the job was a remote command job. Further details are provided for each step of that job, including the time at which the step started and completed and whether the step was successful. The Result section shows that the job was canceled.


N1-ok> show job 9

Job ID:   9
Date:     2005-02-15T16:43:58-0700
Type:     Remote Command
Status:   Stopped
Owner:    root
Errors:   0
Warnings: 0

Step 1:     
Type:        135
Description: connect and lock hosts
Start:       2005-02-15T16:43:58-0700
Completion:  2005-02-15T16:43:58-0700
Result:      Complete
Exception:   No Data Available

Step 2:     
Type:        103
Description: native procedure /bin/sh /opt/sun/n1gc/bin/remotecmd.sh
:[RCMD_KEY]
Start:       2005-02-15T16:43:58-0700
Completion:  2005-02-15T16:43:58-0700
Result:      Complete
Exception:   No Data Available

Step 3:     
Type:        135
Description: connect and lock hosts
Start:       2005-02-15T16:44:00-0700
Completion:  2005-02-15T16:44:00-0700
Result:      Complete
Exception:   No Data Available

Step 4:     
Type:        103
Description: native procedure /bin/sh /opt/sun/n1gc/bin/remotecmd.sh
:[RCMD_KEY]
Start:       2005-02-15T16:44:00-0700
Completion:  2005-02-15T16:44:49-0700
Result:      Incomplete - Aborted
Exception:   No Data Available

Result :        
Server:      server1
Status:      -1
Message:     Command running on server1 was canceled. Command:
/root/sleep.sh 60
Standard Output: Sleeping for 60 seconds...

Each step appears twice in the output. The first appearance of the step in the list is the preflight check, and the second appearance of the step in the list is the actual execution of the step.


See Also

To Issue Remote Commands on a Server or a Server Group

To Delete a Job

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Determine the job you want to delete.


    N1-ok> show job all
    

    All jobs and job IDs appear in the output.

    See show job in Sun N1 System Manager 1.1 Command Line Reference Manual for details.

  3. Delete the desired job.


    N1-ok> delete job job
    

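    In this command, the second job is a placeholder for the Job ID of the job that you want to delete. The job is deleted.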

    See delete job in Sun N1 System Manager 1.1 Command Line Reference Manual for details.

  4. Verify that the job was deleted.


    N1-ok> show job all
    

    The deleted job should not appear in the output.

    See show job in Sun N1 System Manager 1.1 Command Line Reference Manual for details.


Example 5–13 Deleting a Job

This example shows how to delete a job.

First, the show job command is used with the all option, which lists all jobs in descending order of Job ID.


N1-ok> show job all
Job ID          Date                       Type                  Status           Creator
7               2005-02-16T10:51:07-0700   Discovery             Completed        root
6               2005-02-14T14:42:52-0700   Server Reboot         Error            root
5               2005-02-14T14:38:25-0700   Server Power On       Completed        root
4               2005-02-14T14:29:20-0700   Server Power Off      Completed        root
3               2005-02-09T13:01:35-0700   Discovery             Completed        root
2               2005-02-09T12:38:16-0700   Discovery             Completed        root
1               2005-02-09T10:32:40-0700   Discovery             Completed        root

Job ID 6 has an error and can be deleted. The delete job command is now used with the Job ID of the job to be deleted.


N1-ok> delete job 6

The show job command is used again with the all option, which lists all jobs in descending order of Job ID. The deleted job no longer appears in the list.


N1-ok> show job all
Job ID          Date                       Type                  Status           Creator
7               2005-02-16T10:51:07-0700   Discovery             Completed        root
5               2005-02-14T14:38:25-0700   Server Power On       Completed        root
4               2005-02-14T14:29:20-0700   Server Power Off      Completed        root
3               2005-02-09T13:01:35-0700   Discovery             Completed        root
2               2005-02-09T12:38:16-0700   Discovery             Completed        root
1               2005-02-09T10:32:40-0700   Discovery             Completed        root
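
The before-and-after comparison in Example 5–13 can also be checked mechanically when both listings are saved as text. The following Python sketch assumes that each data row begins with a numeric Job ID, as in the listings above. The function names job_ids and deleted_ids are hypothetical.

```python
def job_ids(listing: str) -> set[int]:
    """Return the Job IDs that appear in captured 'show job all' output."""
    ids = set()
    for line in listing.splitlines():
        tokens = line.split(maxsplit=1)
        # Data rows start with a number; the prompt and header do not.
        if tokens and tokens[0].isdigit():
            ids.add(int(tokens[0]))
    return ids


def deleted_ids(before: str, after: str) -> set[int]:
    """Job IDs present in the first listing but absent from the second."""
    return job_ids(before) - job_ids(after)
```

With the two listings from Example 5–13 as input, deleted_ids returns a set that contains only Job ID 6.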


Example 5–14 Deleting All Jobs

This example shows how to delete all jobs.

First, the show job command is used with the all option, which lists all jobs in descending order of Job ID.


N1-ok> show job all
Job ID          Date                       Type                  Status           Creator
7               2005-09-16T10:51:07-0700   Discovery             Completed        root
6               2005-09-14T14:42:52-0700   Server Reboot         Error            root
5               2005-09-14T14:38:25-0700   Server Power On       Completed        root
4               2005-09-14T14:29:20-0700   Server Power Off      Completed        root
3               2005-09-09T13:01:35-0700   Discovery             Running          root
2               2005-09-09T12:38:16-0700   Discovery             Completed        root
1               2005-09-09T10:32:40-0700   Discovery             Completed        root

The delete job command is now used with the all option, to delete all jobs.


N1-ok> delete job all

Unable to delete job "3"

The show job command is used with the all option, to confirm whether all jobs were successfully deleted.


N1-ok> show job all
Job ID          Date                       Type                  Status           Creator
3               2005-09-09T13:01:35-0700   Discovery             Running          root

Job ID 3 remains in the list because it is still running. A job that is in a running state when the delete job command is issued must finish running, or must be stopped, before it can be deleted.

To stop the job and then delete it, first the stop job command is used with the ID of the job to be stopped.


N1-ok> stop job 3

Stop Job "3" request received.

The show job command is used to confirm that the job has been stopped.


N1-ok> show job all
Job ID          Date                       Type                  Status           Creator
3               2005-09-09T13:02:35-0700   Discovery             Aborted          root

The job has been stopped while running and is in the aborted state. The delete job command is now used with the all option, to delete all jobs.


N1-ok> delete job all

The show job command is used to confirm that all jobs have now been deleted.


N1-ok> show job all
Job ID          Date                       Type                  Status           Creator
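
The sequence in Example 5–14 (delete all, stop whatever is still running, delete all again) can be modeled with a toy in-memory job table. The following Python sketch is a simulation for illustration only and does not call the N1 System Manager. The functions delete_all and stop_job are hypothetical, and the sketch treats a stop request as taking effect immediately, which the real stop job command does not guarantee.

```python
def delete_all(jobs: dict[int, str]) -> list[int]:
    """Delete every job that is not running.

    Returns the IDs that could not be deleted, mirroring the
    'Unable to delete job' message in the example.
    """
    blocked = sorted(jid for jid, status in jobs.items() if status == "Running")
    for jid in list(jobs):
        if jid not in blocked:
            del jobs[jid]
    return blocked


def stop_job(jobs: dict[int, str], jid: int) -> None:
    """Mark a running job as Aborted (simplified: takes effect at once)."""
    if jobs[jid] == "Running":
        jobs[jid] = "Aborted"
```

Starting from the statuses in the first listing of Example 5–14, the first call to delete_all leaves only Job ID 3. After stop_job is called for that ID, a second call to delete_all empties the table.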