Sun N1 System Manager 1.3 Discovery and Administration Guide

Managing Jobs

This section describes jobs and their integral role in server monitoring.

Each major action you take in the N1 System Manager starts a job. Use the job log to track the status of a currently running action or to verify that a job has finished. Monitoring jobs is particularly useful because some N1 System Manager actions can take a long time to finish. An example of such an action is installing an OS distribution on one or more managed servers.

You can track jobs through the Jobs tab in the browser interface or the show job command. The show job command provides information about most of the following characteristics:

Job ID

Generated unique identifier.

Date

Date on which the job was started.

Job Type

Type of job. See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details. When using the show job command with the type parameter, jobs can be any of the following types:

  • addbase Add base management support.

  • addosmonitor Add OS monitoring support.

  • createos Create OS distribution from CD/DVD media or ISO files.

  • deletejob Delete job.

  • discover Server discovery.

  • loadfirmware Load firmware update.

  • loados Load OS.

  • loadupdate Load OS update.

  • refresh Server refresh.

  • reset Server reboot.

  • removeosmonitor Remove OS monitoring support.

  • removeserver Server deletion.

  • setagentip Modify management feature configuration. Related to the base management and OS monitoring features.

  • start Server power on.

  • startcommand Remote command execution.

  • stop Server power off.

  • unloadupdate Unload OS update.

State

State of the current job step. Job steps indicate the progress of a job and update results. Each job step has a type, a start time and, when the job completes, a completion time. For the purposes of filtering, job progress is indicated with the following states:

notstarted

Jobs in a notstarted state cannot be stopped.

preflight

When you select a job by ID and view the details of that job, each step of that job can appear twice: once for the preflight check and once for the execution of the step itself.

running

The job is currently running. Jobs that are currently running cannot be deleted using the delete job command. Jobs that are currently running must finish running or be stopped using the stop job command.

Job completion is indicated with the following results:

completed

Indicates that the job step completed successfully.

warning

Indicates that a warning was issued during the job execution. A warning can report an issue that might be severe enough to terminate the job step, and the job, with errors.

stopped

Indicates that the job step stopped before it completed.

pendingstop

Indicates that the job is still running but that the job step cannot complete successfully.

error

Indicates a general error in that job step.

timed_out

Indicates that the job timed out before all of the job steps could complete successfully, or that the next step of the job started before the current step completed successfully.

Complete - Warning is shown as the overall job status if the job successfully completed all of its steps, but one or more WARNING states were issued for steps during the job execution and these warnings were not severe enough to terminate the job with errors.

You can filter jobs depending on their state. See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.
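The Complete - Warning rule above can be sketched in a few lines of Python. This is only an illustration, not the product's logic: the function name `overall_status` is hypothetical, the Completed label for a job whose steps all completed is taken from the job detail examples in this section, and any combination of step results that the text does not describe is deliberately left undecided.

```python
from typing import Iterable, Optional


def overall_status(step_results: Iterable[str]) -> Optional[str]:
    """Derive an overall job status from per-step results.

    Covers only the two cases this section describes; returns None
    for every other combination (assumption: not specified here).
    """
    results = [result.strip().lower() for result in step_results]
    if not results:
        return None
    if all(result == "completed" for result in results):
        return "Completed"
    # Every step finished, but at least one step reported a warning.
    if all(result in ("completed", "warning") for result in results):
        return "Complete - Warning"
    return None


print(overall_status(["Completed", "Warning", "Completed"]))
```

Returning None for the uncovered cases keeps the sketch from guessing how errors, stops, or timeouts combine into an overall status.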

Command

The command that was used to start the job.

Owner

The user who started the job. Also called the job creator.

Job Results

Provides details about the results of a completed job. You can review the standard output of remote command operations and completion statuses for all other job types.

Procedure: To List Jobs

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. View the list of jobs.


    N1-ok> show job all
    

    A list of all jobs for the N1 System Manager is returned.

    See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Example 6–12 Listing All Jobs

This example shows that using the show job command with the all option returns a list of jobs by Job ID, together with the date and time at which the job was started. The job type and status are also returned, along with the identity of the user who created the job.


N1-ok> show job all
Job ID          Date                       Type                  Status        Owner
7               2005-09-16T10:51:07-0700   Discovery             Completed      root
6               2005-09-14T14:42:52-0700   Server Reboot         Error          root
5               2005-09-14T14:38:25-0700   Server Power On       Completed      root
4               2005-09-14T14:29:20-0700   Server Power Off      Completed      root
3               2005-09-09T13:01:35-0700   Discovery             Completed      root
2               2005-09-09T12:38:16-0700   Discovery             Completed      root
1               2005-09-09T10:32:40-0700   Discovery             Completed      root
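If you capture this listing in a script, the rows can be turned into records. The following Python sketch assumes that columns are separated by at least two spaces, as in the example output above; the names `JobSummary` and `parse_job_list` are hypothetical and are not part of the N1 System Manager.

```python
import re
from typing import List, NamedTuple


class JobSummary(NamedTuple):
    job_id: int
    date: str
    job_type: str
    status: str
    owner: str


def parse_job_list(output: str) -> List[JobSummary]:
    """Parse captured 'show job all' output into JobSummary records."""
    jobs = []
    for line in output.splitlines():
        # Assumption: two or more spaces separate columns, so a type
        # such as "Server Power On" stays in one field.
        fields = re.split(r"\s{2,}", line.strip())
        # Skip the prompt line, the header row, and blank lines.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        jobs.append(JobSummary(int(fields[0]), *fields[1:]))
    return jobs


SAMPLE = """\
N1-ok> show job all
Job ID          Date                       Type                  Status        Owner
7               2005-09-16T10:51:07-0700   Discovery             Completed      root
6               2005-09-14T14:42:52-0700   Server Reboot         Error          root
5               2005-09-14T14:38:25-0700   Server Power On       Completed      root
"""

for job in parse_job_list(SAMPLE):
    if job.status == "Error":
        print(job.job_id, job.job_type)
```

Filtering on the Status field in this way picks out the jobs that need attention, such as job 6 in the sample.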

Procedure: To View a Specific Job

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. View a specific job.


    N1-ok> show job job
    

    Detailed information about the job appears in the output. In this command, the second job is a placeholder for the Job ID.


Example 6–13 Viewing Job Details

This example shows that using the show job command with the Job ID returns the date and time at which the job was started, the job type and status, and the identity of the user who created the job. The job in this example is to load an OS profile on a server named 192.168.200.4 using the load server command. Further details are provided for each step of that job, including the time at which the step started and completed and whether the step was successful.


N1-ok> show job 21
Job ID:   21
Date:     2005-10-27T10:09:18-0600
Type:     Load OS
Status:   Completed (2005-10-27T10:37:23-0600)
Command:  load server 192.168.200.4 osprofile SLES9RC5 
bootip=192.168.200.30 networktype=static ip=192.168.200.31
Owner:    root
Errors:   0
Warnings: 0

Steps
ID     Type             Start                      Completion                 Result   
1      Acquire Host     2005-10-27T10:09:19-0600   2005-10-27T10:09:19-0600   Completed
2      Execute Java     2005-10-27T10:09:19-0600   2005-10-27T10:09:19-0600   Completed
3      Acquire Host     2005-10-27T10:09:21-0600   2005-10-27T10:09:21-0600   Completed
4      Execute Java     2005-10-27T10:09:21-0600   2005-10-27T10:37:22-0600   Completed

Results
Result 1: 
Server:   192.168.200.4
Status:   0
Message:  OS deployment using OS Profile SLES9RC5 was successful.
IP address 192.168.200.30 was assigned.
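The Start and Completion timestamps make it possible to work out how long each step took. The following Python sketch assumes that timestamps always use the format shown in the output above; the function name `step_duration_seconds` is hypothetical.

```python
from datetime import datetime

# Format of the timestamps in the show job output, for example
# 2005-10-27T10:09:21-0600 (assumption: the format never varies).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def step_duration_seconds(start: str, completion: str) -> float:
    """Return the elapsed time of a job step in seconds."""
    started = datetime.strptime(start, TIMESTAMP_FORMAT)
    finished = datetime.strptime(completion, TIMESTAMP_FORMAT)
    return (finished - started).total_seconds()


# Step 4 of job 21 in the example above.
print(step_duration_seconds("2005-10-27T10:09:21-0600",
                            "2005-10-27T10:37:22-0600"))
```

For step 4 of job 21 the result is 1681 seconds, or just over 28 minutes, which accounts for nearly all of the time that the job took.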


Example 6–14 Viewing all OS Monitoring Jobs

This example shows how to use the show job command with the addosmonitor Job Type to filter all jobs that add OS monitoring support.


N1-ok> show job type addosmonitor
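A script that builds this filter command can first check the type keyword against the list of job types given earlier in this section. The following Python sketch is illustrative only: the names `JOB_TYPES` and `show_job_by_type_command` are hypothetical, and the function only builds the command text. It does not run the command.

```python
# Job type keywords and descriptions, as listed in this section.
JOB_TYPES = {
    "addbase": "Add base management support",
    "addosmonitor": "Add OS monitoring support",
    "createos": "Create OS distribution from CD/DVD media or ISO files",
    "deletejob": "Delete job",
    "discover": "Server discovery",
    "loadfirmware": "Load firmware update",
    "loados": "Load OS",
    "loadupdate": "Load OS update",
    "refresh": "Server refresh",
    "reset": "Server reboot",
    "removeosmonitor": "Remove OS monitoring support",
    "removeserver": "Server deletion",
    "setagentip": "Modify management feature configuration",
    "start": "Server power on",
    "startcommand": "Remote command execution",
    "stop": "Server power off",
    "unloadupdate": "Unload OS update",
}


def show_job_by_type_command(job_type: str) -> str:
    """Return the command text that filters jobs by the given type."""
    if job_type not in JOB_TYPES:
        raise ValueError(f"unknown job type: {job_type!r}")
    return f"show job type {job_type}"


print(show_job_by_type_command("addosmonitor"))
```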

Procedure: To Stop a Job

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Stop a specific job.


    N1-ok> stop job job
    

    The job is stopped.

    See stop job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

  3. View the job details.


    N1-ok> show job job
    

    The Result section of the output shows that the job was stopped.

    In principle, any job can be stopped. In practice, however, only a job that is not in its last step can be stopped, so a job that has only one step can never be stopped. Jobs in a notstarted state also cannot be stopped. Operations that are performed on large groups of servers can take longer and might include a large number of steps.

    See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Example 6–15 Stopping a Job

This example shows that using the stop job command with the Job ID returns a message confirming that the request has been received.


N1-ok> stop job 32

Stop Job "32" request received.

This example also shows that the show job command can be used with the Job ID of the stopped job to view more details about that job. The Status field confirms that the job was stopped, and the Command field shows the command that was used to create the job. Further details are provided for each step of that job, including the time at which the step started and completed and whether the step was successful. The Result column shows that the job was stopped in its last step.


N1-ok> show job 32
Job ID:   32
Date:     2005-11-02T08:08:37-0700
Type:     Server Refresh
Status:   Stopped (2005-11-02T08:08:48-0700)
Command:  set server 192.168.200.2 refresh
Owner:    root
Errors:   0
Warnings: 0

Steps
ID   Type           Start                      Completion                 Result   
1    Acquire Host   2005-11-02T08:08:38-0700   2005-11-02T08:08:38-0700   Completed
2    Run Command    2005-11-02T08:08:38-0700   2005-11-02T08:08:38-0700   Completed
3    Acquire Host   2005-11-02T08:08:40-0700   2005-11-02T08:08:40-0700   Completed
4    Run Command    2005-11-02T08:08:40-0700   2005-11-02T08:08:47-0700   Stopped

See Also

To Issue Remote Commands on a Managed Server or a Group

Procedure: To Delete a Job

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Determine the job you want to delete.


    N1-ok> show job all
    

    All jobs and job IDs appear in the output.

    See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

  3. Delete the desired job.


    N1-ok> delete job job
    

    The job is deleted.

    See delete job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

  4. Verify that the job was deleted.


    N1-ok> show job all
    

    The deleted job should not appear in the output.

    See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Example 6–16 Deleting a Job

This example shows how to delete a job.

First, the show job command is used with the all option, which lists all jobs in descending order.


N1-ok> show job all
Job ID     Date                       Type                Status        Creator
7          2005-02-16T10:51:07-0700   Discovery           Completed     root
6          2005-02-14T14:42:52-0700   Server Reboot       Error         root
5          2005-02-14T14:38:25-0700   Server Power On     Completed     root
4          2005-02-14T14:29:20-0700   Server Power Off    Completed     root
3          2005-02-09T13:01:35-0700   Discovery           Completed     root
2          2005-02-09T12:38:16-0700   Discovery           Completed     root
1          2005-02-09T10:32:40-0700   Discovery           Completed     root

Job ID 6 has an error and can be deleted. The delete job command is now used with the Job ID of the job to be deleted.


N1-ok> delete job 6

The show job command is used again with the all option, which lists all jobs in descending order. The deleted job no longer appears on the list.


N1-ok> show job all
Job ID     Date                       Type               Status        Creator
7          2005-02-16T10:51:07-0700   Discovery          Completed     root
5          2005-02-14T14:38:25-0700   Server Power On    Completed     root
4          2005-02-14T14:29:20-0700   Server Power Off   Completed     root
3          2005-02-09T13:01:35-0700   Discovery          Completed     root
2          2005-02-09T12:38:16-0700   Discovery          Completed     root
1          2005-02-09T10:32:40-0700   Discovery          Completed     root


Example 6–17 Deleting All Jobs

This example shows how to delete all jobs.

First, the show job command is used with the all option, which lists all jobs in descending order.


N1-ok> show job all
Job ID     Date                       Type               Status        Creator
7          2005-09-16T10:51:07-0700   Discovery          Completed     root
6          2005-09-14T14:42:52-0700   Server Reboot      Error         root
5          2005-09-14T14:38:25-0700   Server Power On    Completed     root
4          2005-09-14T14:29:20-0700   Server Power Off   Completed     root
3          2005-09-09T13:01:35-0700   Discovery          Running       root
2          2005-09-09T12:38:16-0700   Discovery          Completed     root
1          2005-09-09T10:32:40-0700   Discovery          Completed     root

The delete job command is now used with the all option, to delete all jobs.


N1-ok> delete job all

Unable to delete job "3"

The show job command is used with the all option, to confirm whether all jobs were successfully deleted.


N1-ok> show job all
Job ID     Date                       Type             Status     Creator
3          2005-09-09T13:01:35-0700   Discovery        Running    root

Job ID 3 is still running. This is because jobs that were in a running state when the delete job command was issued must finish running, or must be stopped, before they can be deleted.

To stop the job and then delete it, first the stop job command is used with the ID of the job to be stopped.


N1-ok> stop job 3

Stop Job "3" request received.

The show job command is used to confirm that the job has been stopped.


N1-ok> show job all
Job ID     Date                       Type             Status        Creator
3          2005-09-09T13:02:35-0700   Discovery        Aborted       root

The job has been stopped while running and is in the aborted state. The delete job command is now used with the all option, to delete all jobs.


N1-ok> delete job all

The show job command is used to confirm that all jobs have now been deleted.


N1-ok> show job all
Job ID     Date                      Type              Status        Creator
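The behavior in this example follows from the rule that running jobs cannot be deleted. The following Python sketch applies that rule to a job list to predict which jobs a delete job all request removes and which ones remain. The function name `plan_delete_all` is hypothetical, and the sketch assumes that Running is the only status that blocks deletion, which is the only such case that this section describes.

```python
from typing import Iterable, List, Tuple


def plan_delete_all(jobs: Iterable[Tuple[int, str]]) -> Tuple[List[int], List[int]]:
    """Split (job_id, status) pairs into deletable and blocked job IDs."""
    deletable, blocked = [], []
    for job_id, status in jobs:
        # Running jobs must finish or be stopped before deletion.
        if status.lower() == "running":
            blocked.append(job_id)
        else:
            deletable.append(job_id)
    return deletable, blocked


# The job list from the first listing in this example.
jobs = [(7, "Completed"), (6, "Error"), (5, "Completed"), (4, "Completed"),
        (3, "Running"), (2, "Completed"), (1, "Completed")]
deletable, blocked = plan_delete_all(jobs)
print("deleted:", deletable)
print("remaining:", blocked)
```

Job 3 is the only job that remains, which matches the output of the second show job all command above.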

Job Queueing

Each type of job in the N1 System Manager has a weight associated with it. The weight reflects the load that the job places on system resources. A global limit governs how much total load can be placed on the system. The maximum load permitted is 1000. The following table lists the weight for each type of user-level job.

Table 6–4 Job Weight Values

Job                         Weight
OS Deployment               500
Package Deployment          500
Package Uninstall           500
Discovery                   200
Firmware Deployment         500
Remote Command Execution    200
Job Deletion                400
Create OS                   1000
Reset Server                200
Server Power Off            200
Server Power On             200
Server Refresh              200
Set Server Feature          200
Remove Server               100
Add Server                  100

The total load is the sum of the loads of all the current running jobs. The system will compare the current total load with the maximum permitted load at the following points in time:

If the difference between the current total load and the maximum permitted load is great enough to accommodate the job at the head of the job queue, then that job is promoted to a running state. Otherwise, it is left in the queued state. The current total load governs the permissible concurrent running job mix within the system.

For example, only two OS Deployment jobs can be running at one time:

500 + 500 = 1000

Alternatively, one OS Deployment job and two Server Power Off jobs can be running at one time, which leaves 100 units of load available:

500 + 200 + 200 = 900 < 1000
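The promotion rule can be sketched as a short Python simulation. This is a sketch under stated assumptions, not the product's scheduler: the names `MAX_LOAD`, `JOB_WEIGHTS`, and `promote_queued_jobs` are hypothetical, the weights are copied from Table 6–4, and the sketch assumes that the system keeps promoting jobs from the head of the queue until the head job no longer fits.

```python
from collections import deque
from typing import Deque, List

MAX_LOAD = 1000

# Weights from Table 6-4.
JOB_WEIGHTS = {
    "OS Deployment": 500, "Package Deployment": 500,
    "Package Uninstall": 500, "Discovery": 200,
    "Firmware Deployment": 500, "Remote Command Execution": 200,
    "Job Deletion": 400, "Create OS": 1000, "Reset Server": 200,
    "Server Power Off": 200, "Server Power On": 200,
    "Server Refresh": 200, "Set Server Feature": 200,
    "Remove Server": 100, "Add Server": 100,
}


def promote_queued_jobs(running: List[str], queue: Deque[str]) -> List[str]:
    """Promote jobs from the head of the queue while they fit.

    Only the head of the queue is considered: if it does not fit
    under MAX_LOAD, it stays queued and so do the jobs behind it.
    """
    promoted = []
    load = sum(JOB_WEIGHTS[job] for job in running)
    while queue and load + JOB_WEIGHTS[queue[0]] <= MAX_LOAD:
        job = queue.popleft()
        running.append(job)
        load += JOB_WEIGHTS[job]
        promoted.append(job)
    return promoted


running: List[str] = []
queue = deque(["OS Deployment", "OS Deployment", "Server Power Off"])
print(promote_queued_jobs(running, queue))
print(list(queue))
```

With an empty system, both OS Deployment jobs are promoted and bring the total load to 1000, so the Server Power Off job stays in the queue until one of them finishes.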