Sun N1 System Manager 1.2 Release Notes

Chapter 1 Sun N1 System Manager 1.2 Issues

This chapter describes the Sun N1™ System Manager 1.2 issues that are known to be problems.

What's New in N1 System Manager 1.2

Here is a list of updates and changes for the N1 System Manager 1.2 release:

Feature and Software Support Notices

This section lists the features and software that are not supported in the Sun N1 System Manager 1.2 release.

Documentation Updates

This section describes known documentation updates, including documentation errata.

Installation and Configuration Guide

In the section DHCP Service Conflict With N1 Grid Service Provisioning System in Sun N1 System Manager 1.2 Installation and Configuration Guide (page 37), "Sun N1 Grid Service Provisioning System" should read "Sun N1 Service Provisioning System", and "ISP" should read "OS Provisioning".

Command Line Help

This section provides documentation errata in the command line help pages.

IPMI Credentials Require a User Name and Password (6344419)

Some parts of the command line tab help and the help pages show incorrect information about what to enter for IPMI credentials. All IPMI credential values in the command line require a user name and password pair value instead of just a password as documented. All IPMI values must be in the following syntax: user-name/password


Note –

Sun Fire V20z and V40z servers require only a password for IPMI credentials.
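The user-name/password form can be checked before a value is typed into a command. The following is a minimal shell sketch, not part of the N1 System Manager: the function name is_ipmi_pair is hypothetical, and the check only confirms that something non-empty appears on each side of the first slash. It does not apply to Sun Fire V20z and V40z servers, which take a password alone.

```shell
# Hypothetical helper, not an N1 System Manager command.
# Succeeds when the value has the form user-name/password,
# with a non-empty part on each side of the first slash.
is_ipmi_pair() {
    case "$1" in
        ?*/?*) return 0 ;;
        *)     return 1 ;;
    esac
}

is_ipmi_pair "admin/secret" && echo "user-name/password form"
is_ipmi_pair "secret" || echo "password only"
```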


Hiding Passwords in the Command Line

You can enter a question mark (?) for any password attribute value if you do not want the password to display in the command line. Once you enter the command, you are prompted for the password. Examples include the rootpassword and agentssh attributes.

Specifying the force and netboot Attributes

The force and netboot attributes are documented in the command line help pages without a corresponding value. You must specify true as their values to provide a valid command, such as force=true or force true.

set user

The default role for the root user is automatically set to Admin after you reboot the management server or if you restart the N1 System Manager. It is still possible to set the root user's default role to a different role, but this is not a permanent assignment.

load group and load server

The following attributes have been added for the load group command and when specifying multiple servers with the load server command:


Note –

When installing the Red Hat 4 OS on Sun Fire X2100 servers, the bootnetworkdevice and networkdevice values must both be set to eth1. The default values do not work for this situation.


N1 System Manager Installation Issues

This section describes known N1 System Manager installation issues.

N1 System Manager Can Fail to Install on Sun Fire X4100 and X4200 Servers (6284696)

If the N1 System Manager installation process is interrupted and restarted, the N1 System Manager installation can fail in step 5, “Install OS provisioning components”. If this issue occurs, a subsequent uninstall and reinstall of the N1 System Manager will fail.

The installation log file /var/tmp/installer.log.latest shows the following after initial installation failure:

Installing Master Server ...
Error! Missing file (looked for /opt/SUNWn1sps 
  /N1_Grid_Service_Provisioning_System_5.1
  /server/postgres/postgresql.conf.in)!
print() on closed filehandle GEN0 at 
  /usr/perl5/5.8.4/lib/i86pc-solaris-64int/IO/Handle.pm line 399.
SPS install failed with exit status: 256
-----------------------------

      2k. Which port should Postgres listen on?
          (default: 5434) [1024-65535] spawn id(3) is not a tty. Not changing mode 
  at /usr/perl5/site_perl/5.8.4/Expect.pm line 375.
admin
admin
admin

      ** Invalid Input.  Enter a numeric value for the port number.

      2k. Which port should Postgres listen on?
          (default: 5434) [1024-65535] spawn id(3) is not a tty. Not changing mode 
  at /usr/perl5/site_perl/5.8.4/Expect.pm line 375.
admin
admin
admin

      ** Invalid Input.  Enter a numeric value for the port number.

      2k. Which port should Postgres listen on?
          (default: 5434) [1024-65535

The installation log shows the following after uninstall and reinstall of the N1 System Manager software:


Error!  Failed to initialize the database (exit value was 1).
Exiting..
print() on closed filehandle GEN0 at /usr/lib/perl5/5.8.0
   /i386-linux-thread-multi/IO/Handle.pm line 395.
SPS install failed with exit status: 256
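To tell whether a given installation log records this failure, you can search it for the final error line that both excerpts share. The following is a minimal sketch under that assumption; the function name sps_install_failed is hypothetical, and the log path is the one named earlier in this section.

```shell
# Hypothetical helper: succeeds if the given installer log contains
# the failure line quoted in the log excerpts.
sps_install_failed() {
    grep "SPS install failed with exit status" "$1" >/dev/null 2>&1
}

log=/var/tmp/installer.log.latest
if sps_install_failed "$log"; then
    echo "SPS install failure recorded in $log"
fi
```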

Workaround: Perform the workaround procedure below that is applicable to the operating system installed on your management server. Depending on how the installation error occurred, some of the workaround steps might not complete successfully. If a workaround step does not complete successfully, go to the next step.

Solaris-based Sun Fire X4100 or X4200 management server:

  1. Stop the server and agent.


    # su - n1gsps -c "/opt/SUNWn1sps/N1_Service_Provisioning_System_5.1/server/bin/cr_server stop"
    # su - n1gsps -c "/opt/SUNWn1sps/N1_Service_Provisioning_System/agent/bin/cr_agent stop"
    
  2. Uninstall service provisioning manually.


    # /opt/SUNWn1sps/N1_Service_Provisioning_System_5.1/cli/bin/cr_uninstall_cli.sh
    # /opt/SUNWn1sps/N1_Service_Provisioning_System_5.1/server/bin/cr_uninstall_ms.sh
    
  3. Remove the following packages.

    SUNWspsc1

    SUNWspsms

    SUNWspsml


    # pkgrm SUNWspsc1
    # pkgrm SUNWspsms
    # pkgrm SUNWspscl
    

    Type y in response to prompts asking “Do you want to remove this package? [y,n,?,q]”. If the message pkgrm: ERROR: no package associated with SUNWspscl appears, that package has already been removed by step 2. Continue removing packages.

  4. Delete the service provisioning directory and files.


    # cd /
    # rm  -rf  /opt/SUNWn1sps/
    # rm /n1gc-setup/sps/state
    # rm /n1gc-setup/state/0installSPS.pl.state
    
  5. Reboot the management server and then install the N1 System Manager software.
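Because step 3 of the Solaris procedure can report that a package is already gone, the removal can be wrapped in a check that skips packages that are not installed. The following is a minimal sketch, assuming the standard Solaris pkginfo -q and pkgrm commands; the function name remove_if_installed is hypothetical, and the package names are passed as arguments rather than assumed.

```shell
# Hypothetical helper for a Solaris management server.
# For each package name given, runs pkgrm only if pkginfo -q
# reports that the package is installed.
remove_if_installed() {
    for pkg in "$@"; do
        if pkginfo -q "$pkg"; then
            pkgrm "$pkg"
        else
            echo "skipping $pkg: not installed"
        fi
    done
}
```

For example, remove_if_installed SUNWspsms removes that package if it is present. pkgrm still prompts for confirmation as described in step 3.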

Linux-based Sun Fire X4100 or X4200 management server:

  1. Stop the server and agent.


    # su - n1gsps -c "/opt/sun/N1_Service_Provisioning_System_5.1/server/bin/cr_server stop"
    # su - n1gsps -c "/opt/sun/N1_Service_Provisioning_System/agent/bin/cr_agent  stop"
    
  2. Delete the service provisioning directory and files.


    # cd /
    # rm  -rf  /opt/sun/N1_Grid_Service_Provisioning_System_5.1
    # rm  -rf  /opt/sun/N1_Grid_Service_Provisioning_System
    # rm  -rf  /opt/sun/N1_Service_Provisioning_System
    # rm  -rf  /opt/sun/N1_Service_Provisioning_System_5.1
    # rm /n1gc-setup/sps/state
    # rm /n1gc-setup/state/0installSPS.pl.state
    
  3. Reboot the management server and then install the N1 System Manager software.

Security Issues

This section describes known security issues.

Browser Interface Session Never Times Out (6222506)

The event refresher frame reloads every 10 seconds, which updates the user's session timestamp. Therefore, the browser interface session will never time out.

Workaround: Explicitly log out when you are done using the browser interface.

Performance Issues

This section describes known performance issues.

Deleting Servers and Enabling/Disabling Monitoring on Servers May Take a Long Time (6344175)

The delete server and set server monitored commands may take a long time to complete. These commands do not generate a job, so you have to wait until the command finishes before you can submit another command. This is especially important to note if you try to enable/disable monitoring on a group of servers.

Discovery Issues

This section describes known discovery issues.

Hardware Model Type Information Missing From Server List After Discovering a Group of Servers (6349404)

When discovering a large group of servers (30 to 40 servers), the hardware model value does not display in the server list for some servers. For example, server 10.18.0.38 does not have the hardware information listed:


Name             Hardware    Hardware Health        Power     OS Usage    OS Resource Health
10.18.0.31       V20z        Good                   On        -             -
10.18.0.32       V20z        Good                   On        -             -
10.18.0.33       V20z        Good                   On        -             -
10.18.0.35       V20z        Good                   On        -             -
10.18.0.36       V20z        Good                   On        -             -
10.18.0.37       V20z        Good                   On        -             -
10.18.0.38       -           Good                   On        -             -
10.18.0.39       V20z        Good                   On        -             -

Workaround: Use the refresh attribute with the set server or set group command to update the servers in the list.
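With 30 to 40 servers, it can help to pick out the affected rows automatically. The following is a minimal sketch that reads a server list in the column layout shown above and prints the names whose Hardware column is a dash. The function name servers_missing_model is hypothetical, and the sketch assumes the list has a single header line and whitespace-separated columns.

```shell
# Hypothetical helper: reads a server list on standard input and
# prints the name of each server whose Hardware column is "-".
# Assumes one header line followed by one row per server.
servers_missing_model() {
    awk 'NR > 1 && $2 == "-" { print $1 }'
}

servers_missing_model <<'EOF'
Name             Hardware    Hardware Health        Power     OS Usage    OS Resource Health
10.18.0.37       V20z        Good                   On        -             -
10.18.0.38       -           Good                   On        -             -
10.18.0.39       V20z        Good                   On        -             -
EOF
# prints 10.18.0.38
```

Each name printed is a candidate for the refresh described in the workaround.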

OS Provisioning Issues

This section describes the known OS provisioning (deployment) issues.

OS Deployment of Red Hat Linux 3.0 Update 2 Might Stop and Enter Interactive Mode

OS deployment of Red Hat Linux 3.0 Update 2 might stop and enter interactive mode due to a time out issue. This is an intermittent problem.

Workaround: Stop the OS deployment job and retry the OS deployment. If OS deployment consistently fails, you will have to use a later version of the Red Hat OS.

Unrequired Mount Point Must Be Specified for Swap Partition When Creating OS Profile

When specifying the swap partition for an OS profile using either the browser interface or the command line, you must specify a mount point even though a swap partition does not require one. If you specify a mount point other than swap, a separate file system is actually created.

Workaround: Specify swap as the mount point for the swap partition, which serves as a placeholder and is ignored. The following is a command line example:


add osprofile myprofile partition swap type swap size 1024 device c0t0d0s1 sizeoption fixed

Setting Baud Rate for the BIOS Console Makes OS Deployment Fail on Sun Fire V20z and V40z Servers (6322295)

The baud rate for the BIOS console must be set to 9600 (default) or OS deployment to a Sun Fire V20z or V40z server will fail. This means that you cannot change the consolebaudrate value in the load server command or the Load OS wizard in the browser interface.


Note –

If the SP console baud rate is set to something other than 9600, the OS deployment will succeed but the console through the connect server command will display garbage characters.


Workaround: You must change the baud rate for the BIOS console manually after an OS deployment. To do this, reboot the target server and enter the BIOS setup screen during the boot sequence. Consult the server's user manual to see how to change its BIOS settings.

An OS Profile Name Cannot Contain a Period When Using the Browser Interface (6331294)

If you specify a period in an OS profile name when using the browser interface wizard to create an OS profile, an "invalid profile name" error occurs. A period should be an acceptable character for an OS profile name.

Workaround: If you want to include a period in the OS profile name, use the create osprofile command.

The Same SUSE OS Profile Cannot Be Used on Sun Fire V20z/V40z Servers and Sun Fire X4000 Series Servers (6344382)

Once you load a SUSE OS profile on a Sun Fire X4000 series server, that same OS profile and associated OS distribution cannot be used on Sun Fire V20z and V40z servers. Loading a SUSE OS profile on a Sun Fire X4000 series server actually modifies the associated SUSE OS distribution, which makes that OS distribution unusable by Sun Fire V20z and V40z servers.

Workaround: You must create separate SUSE OS distributions and OS profiles for Sun Fire V20z/V40z servers and Sun Fire X4000 series servers.

Interface Issues

This section describes the known browser interface and command line interface issues.

Incorrect Server Details Are Displayed When Servers Swap Management IP Addresses (6196399)

If the management IP addresses of two discovered servers are swapped, the detailed server information displayed for each of the servers with the swapped addresses will be the information for the other swapped server. For example, if server A and server B have their management IP addresses swapped, “show server A” will show server B's information and “show server B” will show server A's information.

Workaround: Delete both of the servers with swapped IP addresses and then rediscover them. Any user-supplied information about the servers is lost as a result.

Browser Interface Becomes Out of Sync if Back Button Is Used (6215298)

The browser interface uses frames that are synchronized. If you click the browser's Back button in one of the frames, the frames can get out of sync.

Workaround: Press F5 or refresh the page to synchronize the frames.

The Bottom Line of the Serial Console Message Is Not Shown in the Console Window (6308148)

The last line of the serial console launched from the N1 System Manager browser interface is not displayed in the serial console window.

Workaround: Press Enter or Return to display the last line.

Only SSHv1 Is Supported for Serial Console Access (6309107)

The applet used for serial console access from the Web Browser interface uses SSHv1 only for communication back to the N1 System Manager management server. This feature requires enabling SSHv1 for the N1 System Manager management server.

Workaround: If you do not want to enable SSHv1 and the serial console Web Browser interface, you can use the serial console feature from the n1sh command line interface.

Serial Console Feature Requires Sun Java™ Plugin 1.4.2 or Later Installed (6315615)

To use the Serial Console feature from the Server Details page in the browser interface, the Sun Java Plugin 1.4.2 or later must be installed on the system where you are running the browser. Not all of the supported browsers for the N1 System Manager have this installed.

Incorrect Swap Information Is Reported for Sun Fire X4100 and Sun Fire X4200 Servers with Firmware Level 6464 (6344709)

The browser interface server details and the show server command display the wrong swap information for Sun Fire X4100 and Sun Fire X4200 servers with firmware level 6464 and the Red Hat operating system.

Workaround: Use the serial console to access the server and find out the correct swap information by using the top command.

Firmware Update Issues

This section describes known firmware update issues.

N1 System Manager Allows Deployment of Incompatible Firmware to Dual-Core Sun Fire V20z and Sun Fire V40z Servers (6296404)

Dual-core Sun Fire V20z and Sun Fire V40z servers require firmware revision 2.3.x or later. N1 System Manager does not prevent you from deploying firmware revisions below 2.3.x. Deploying firmware revisions below 2.3.x may result in issues with the server's service processor.

Workaround: Double check the firmware revision before updating.
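The revision check can be scripted as a simple comparison against 2.3. The following is a minimal sketch, assuming the revision is a dotted numeric string such as 2.3.0.11. The function name fw_at_least_2_3 and the sample revision strings are hypothetical, and the sketch does not query a server for its firmware.

```shell
# Hypothetical helper: succeeds when a dotted numeric firmware
# revision is 2.3 or later. Only the first two fields are compared.
fw_at_least_2_3() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 3 ]; }
}

fw_at_least_2_3 "2.3.0.11" && echo "2.3.0.11 is acceptable for dual-core servers"
fw_at_least_2_3 "2.2.0.6" || echo "2.2.0.6 is below 2.3.x"
```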

Inadequate Error Message When Firmware Update Fails on ALOM-based Servers (6330195)

If the ftp service is not enabled on the management server, firmware updates on ALOM-based servers fail with the following error message in the job output:


An exception occurred trying to update server-name. Please refer to the log file for more information.

Workaround: Enable the ftp service on the management server. See Enabling FTP on the Management Server in Sun N1 System Manager 1.2 Site Preparation Guide for details.

OS Update Issues

Installing a Linux OS Update Created From a Solaris Package Outputs Inadequate Error Messages (6230630)

The create update command allows you to create a Linux OS update from a Solaris package, even though this is not a valid procedure. If you happen to do this and then try to install the OS update on a Linux system, the Update job is accepted but it eventually fails with error messages that do not help diagnose the underlying problem.

Workaround: Make sure that the update is compatible with the installed OS. You can view a provisionable server's OS by using the show server command, and you can view the OS type for an OS update by using the show update command.

Failed Solaris OS Update May Cause Subsequent OS Update Failures (6310032)

If a Solaris OS update installation fails, a copy of the admin file used for the installation is not removed from the provisionable server. If the failure was due to a corrupt or invalid admin file, subsequent OS update installations will not replace the faulty admin file and it may cause continued failures.

Workaround: Delete the package-filename.admin file in the provisionable server's /tmp directory and retry the OS update installation. If you specified a customized admin file for the OS update, ensure that the admin file is valid.

The create update Command Fails When Copying Solaris Packages and Patches From a URL (6324124)

The create update command does not work if you specify a valid Solaris package or patch through a URL (http://). An error similar to the following is displayed:


# ./n1sh create update sol file http://10.11.1.35/scs/SVR4/SCSFpoppl.pkg ostype solaris10x86
File "http://10.11.1.35/scs/SVR4/SCSFpoppl.pkg" exists but is not a valid update file.

Workaround: You must first download the package or patch to a location that is accessible from the management server and then specify a fully qualified path to the package or patch.
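The download step can be sketched as follows. This is an illustration only: it assumes that wget is installed on the management server, and the function names update_local_path and fetch_update are hypothetical. The first function works out where the file will be saved; the second downloads it and prints the resulting path.

```shell
# Hypothetical helpers; wget is an assumption and might not be installed.

# Prints the local path for a URL's file name inside a directory.
update_local_path() {
    printf '%s/%s\n' "${2%/}" "${1##*/}"
}

# Downloads the URL into the directory and prints the local path.
# Removes the partial file and fails if the download fails.
fetch_update() {
    dest=$(update_local_path "$1" "$2")
    if wget -q -O "$dest" "$1"; then
        printf '%s\n' "$dest"
    else
        rm -f "$dest"
        return 1
    fi
}
```

For the URL in the error example, fetch_update with a destination of /var/tmp would print /var/tmp/SCSFpoppl.pkg, and that fully qualified path can then be given to the create update command in place of the URL.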

Monitoring Issues

Apostrophe Cannot Be Used in Notification Description (6242713)

The create notification command fails if an apostrophe is used for the description attribute.

Workaround: Escape the apostrophe with another apostrophe (for example, Support''s Notification) or do not use an apostrophe in the description.
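If descriptions are generated by a script, the doubling can be applied automatically. The following is a minimal sketch; the function name escape_apostrophes is hypothetical, and it does nothing more than replace each apostrophe with two.

```shell
# Hypothetical helper: doubles every apostrophe in its argument.
escape_apostrophes() {
    printf '%s\n' "$1" | sed "s/'/''/g"
}

escape_apostrophes "Support's Notification"
# prints Support''s Notification
```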

Clock Icon Representing Running Jobs Remains After Jobs Finish (6258571)

Even after all jobs are finished running, the clock icon next to the servers in the View Selector section may still display, which is a problem with the refresh feature.

Workaround: Click the Refresh button or press F5 to refresh the browser interface.

A Misleading Create OS Job Status After Running Out of Disk Space on Management Server (6299790)

If a Create OS job is running and the management server runs out of disk space, the job status shows “running”. When disk space is cleaned up, and the N1 System Manager is restarted, the job status changes to “complete” even though the Create OS job has failed.


Note –

The failed job's state will remain shown as “complete” and cannot be corrected.


Workaround:

  1. Free up at least 3 Gbytes of disk space.

  2. Stop and restart the N1 System Manager.

  3. Resubmit the Create OS operation.
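Before resubmitting, you can confirm that step 1 freed enough space. The following is a minimal sketch using df; the function names are hypothetical, and the path you check is an assumption because these notes do not say which file system the Create OS job fills. The -P option is the POSIX output format and might require /usr/xpg4/bin/df on a Solaris management server.

```shell
# 3 Gbytes expressed in Kbytes.
required_kb=3145728

# Hypothetical helper: prints the available Kbytes on the file
# system that holds the given path.
available_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeeds when the given number of Kbytes
# is at least 3 Gbytes.
enough_for_create_os() {
    [ "$1" -ge "$required_kb" ]
}

# The path / is a placeholder; substitute the file system in question.
if enough_for_create_os "$(available_kb /)"; then
    echo "at least 3 Gbytes free"
else
    echo "less than 3 Gbytes free"
fi
```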

Jobs That Are Queued But Not Running Are Shown in the Job Detail as “Not Started” (6318398)

When the total job load is high enough to prevent the next job in the queue from running, the Job Details screen shows the running jobs' status as “running”, and the status for other jobs is shown as “Not Started”. The queued jobs will run after one or more of the running jobs have completed and the total job load is low enough to allow the next job in the queue to run.

See Job Queueing in Sun N1 System Manager 1.2 Administration Guide for further information.

Memory Threshold Values Are Not Updated Properly When OS Monitoring Agents are Upgraded by the agentupgrade Script (6330911)

If you upgrade the OS monitoring agents to the 1.2 release by using the agentupgrade script, the following memory threshold values are not updated properly:

This issue may put the associated provisionable servers in a “Failed Critical” state.

Workaround: After you upgrade the OS monitoring agents, use the set server or set group commands to change the threshold values on the servers.

Email Notifications Stop Being Sent After an OS Monitor Threshold Is Exceeded (6347039)

If an OS monitor threshold is exceeded, email notifications may stop being sent. This is an intermittent problem.

Workaround: Perform the following procedure on the management server.

  1. Shut down the N1 System Manager.


    # /etc/init.d/n1sminit stop
    
  2. For a Red Hat management server, change directory to /etc/opt/sun/cacao/modules.


    # cd /etc/opt/sun/cacao/modules
    

    For a Solaris management server, change directory to /etc/opt/SUNWcacao/modules.


    # cd /etc/opt/SUNWcacao/modules
    
  3. Back up the XML files that need to be updated.


    # cp coreservicemodule.xml coreservicemodule.xml.save
    # cp com.sun.hss.domain.xml com.sun.hss.domain.xml.save
    
  4. Remove the following lines from the coreservicemodule.xml file.

    <path-element>
      file:/opt/sun/n1gc/lib/activation.jar
    </path-element>
    <path-element>
      file:/opt/sun/n1gc/lib/mail.jar
    </path-element>
  5. Remove the following lines from the com.sun.hss.domain.xml file.

    <path-element>
      file:/opt/sun/n1gc/lib/mailapi.jar
    </path-element>
    <path-element>
      file:/opt/sun/n1gc/lib/imap.jar
    </path-element>
    <path-element>
      file:/opt/sun/n1gc/lib/pop3.jar
    </path-element>
    <path-element>
      file:/opt/sun/n1gc/lib/smtp.jar
    </path-element>
  6. Add the following line to the com.sun.hss.domain.xml file.

    <path-element>
      file:/opt/sun/n1gc/lib/mail.jar
    </path-element>
  7. Start the N1 System Manager.


    # /etc/init.d/n1sminit start
    

Unable to Change a Server's agentssh Values (6347588)

Changing a server's agentssh value by using the set server command does not work.

Workaround: If you want to change a server's agentssh value, you must delete the server, discover it again, and use the add server command to set the agentssh value.

Cannot Set the fsusage.pctused Threshold on a Group of Servers (6347647)

Setting the fsusage.pctused threshold using the set group command does not work. For example:


N1-ok> set group test-systems threshold fsusage.pctused warninghigh=70 criticalhigh=80
Invalid value "fsusage.pctused".

Workaround: You must set the fsusage.pctused threshold on a single server by using the set server command. You can create an n1sh customized script to help automate this procedure for a large number of servers. See To Run a Script of N1 System Manager Commands in Sun N1 System Manager 1.2 Administration Guide for more information.
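A script of this kind can be generated rather than typed by hand. The following is a minimal sketch that prints one set server line per server. It assumes that set server accepts the same threshold syntax as the set group example above, which these notes do not show explicitly; the function name fsusage_commands and the server names are hypothetical.

```shell
# Hypothetical helper: prints one set server line per server name.
# Assumption: set server takes the same threshold arguments as the
# set group example. Arguments: warning value, critical value,
# then one or more server names.
fsusage_commands() {
    warn=$1
    crit=$2
    shift 2
    for server in "$@"; do
        printf 'set server %s threshold fsusage.pctused warninghigh=%s criticalhigh=%s\n' \
            "$server" "$warn" "$crit"
    done
}

fsusage_commands 70 80 server1 server2
```

Redirect the output to a file and run it as described in the Administration Guide procedure referenced above.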

Resetting a Provisionable Server Makes the OS Health State Invalid Until the Server Information Is Refreshed (6351266)

After you reset (reboot) a provisionable server using the reset command or the Reset menu item in the browser interface, the server's OS resource health state changes to “Failed critical” until a refresh occurs after five to ten minutes. This happens even though the server's OS health state is good.

Workaround: After you reset a provisionable server, you can refresh the server using the set server refresh command, or wait five to ten minutes for the server's status to refresh automatically.

Localization Issues

Non-ASCII Objects Display Random Characters if the N1 System Manager Is Running in a Non-UTF8 Locale (6231209)

Non-ASCII objects created using the N1 System Manager display random characters if you start N1 System Manager in the following ways:

Workaround: Use either of the two following methods.

  1. Temporary solution: set the LANG environment variable to the UTF8 locale on the management server and restart the N1 System Manager. For example:


    # export LANG=en_US.UTF-8
    # /etc/init.d/n1sminit stop
    # /etc/init.d/n1sminit start
    
  2. Permanent solution:

    • On a Solaris-based management server:

      Edit the file /etc/default/init and change the LANG value to en_US.UTF-8.

    • On a Linux-based management server:

      Edit the file /etc/sysconfig/i18n and change the LANG value to en_US.UTF-8.
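To confirm which case applies before restarting the N1 System Manager, you can test the current LANG value. The following is a minimal sketch; the function name is_utf8_locale is hypothetical, and it only looks at the codeset suffix of the locale name, ignoring any modifier such as @euro.

```shell
# Hypothetical helper: succeeds when a locale name ends in a
# UTF-8 codeset suffix, for example en_US.UTF-8.
is_utf8_locale() {
    case "$1" in
        *.UTF-8|*.utf8|*.UTF8|*.utf-8) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_locale "${LANG:-}"; then
    echo "LANG is a UTF-8 locale"
else
    echo "LANG is not a UTF-8 locale"
fi
```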

Cannot Install ALOM Firmware With a Non-ASCII Firmware Name (6297238)

The load server command fails to install ALOM firmware if the firmware name is non-ASCII.

Workaround: Change the firmware name to ASCII using the set firmware command.

Internationalization Features Are Not Supported for the n1sh Command on Solaris Management Servers (6297808)

The Python version (2.3) on a default Solaris management server does not provide adequate internationalization support for the n1sh command.

Workaround: Install Python 2.4 or later on the Solaris management server. The Python executable must be /usr/bin/python2.4.

Deploying Solaris 10 With Some Installation Languages Will Time Out (6178721, 6179110)

If you deploy Solaris 10 with an OS profile that has a particular installation language set, the installation is performed in interactive mode and you must select a language when prompted. The deploy OS job will eventually time out if you do not make the language selection. The following languages create this behavior:

Workaround: Because the installation is no longer automated, you must monitor the deployment through the server's serial console and make the language selection. You can choose Serial Console from the Actions menu in the browser interface or use the connect server command.

Powering Off/On a Server Running a Non-English Locale Makes the Server's Hardware Health Status “Failed Critical” (6343747)

This problem occurs if an ALOM-based, Solaris provisionable server is running a non-English locale and you power the server off and on using the stop and start commands, respectively.

Workaround: Use the reset server command on the server.

A Solaris SPARC Provisionable Server Running a Non-English Locale Does Not Display Package Information (6350202)

If a Solaris SPARC provisionable server is running a non-English locale, the package information does not display in the server details output.

Workaround: Edit the file /etc/default/init on the provisionable server, change the LANG value to en_US.UTF-8, and reboot the server.

Adding OS Monitoring Agents to a Server Fails When Running a Non-English Locale (6351553)

When adding OS monitoring agents to a provisionable server, there are two situations that fail because the server is running a non-English locale.

Workaround: