N1 Grid Service Provisioning System User's Guide and Release Notes for the OS Provisioning Plug-In 1.0

Chapter 2 Release Notes for OS Provisioning Plug-In

This chapter describes late-breaking news and known issues with the OS provisioning plug-in.

The chapter contains the following information:

Installation Issues

There are no known installation issues.

Runtime Issues

The following issues are known to exist when provisioning operating systems.

Solaris: Wrong Encryption of root Password Causes JumpStart Error (6245964)

Description: You see the following messages during installation and the installation becomes interactive:


root_password=Clz6pK2b6qw=
	syntax error  line 2 position 15

The password variable sysidcfg_root_password_base_conf in the Solaris OS profile expects an encrypted value. However, the password that you supplied was not a Solaris-encrypted password.

Workaround: Use the Solaris tools to encrypt the password. The appropriate mechanism is to create a user with that password, look in the /etc/shadow file for the user's encrypted password, and use that string as the value of the sysidcfg_root_password_base_conf variable.
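The lookup in the workaround above can be scripted. The following is a minimal sketch, assuming a standard colon-separated shadow file whose second field holds the encrypted password. The function name shadow_hash is hypothetical and is not part of the plug-in.

```shell
#!/bin/sh
# shadow_hash FILE USER   (hypothetical helper, not part of the plug-in)
# Print the encrypted password (second colon-separated field) for USER.
# Returns non-zero if the user is missing or the entry is empty or locked.
# Note: the awk -v option may require nawk or /usr/xpg4/bin/awk on Solaris.
shadow_hash() {
    awk -F: -v user="$2" '
        $1 == user { found = 1; hash = $2 }
        END {
            # Entries such as *LK*, !, or NP are not usable password values.
            if (!found || hash == "" || hash ~ /^[*!]/ || hash == "NP") exit 1
            print hash
        }' "$1"
}
```

Run as root, a call such as shadow_hash /etc/shadow tmpuser (where tmpuser is a temporary account created with the desired password) prints the string to paste into the OS profile.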

Solaris: RA Installer Uses Value of HostName Instead of Target sps_ra_host (6255081)

Description: If the OS profile has the value install_ra_from_snapshot_spsra="n", the N1 Grid SPS Remote Agent (RA) listens on the IP address specified by the host name rather than the IP address specified in the variable sps_ra_host. In this case, when the RA starts, the transport.config file uses the host name of the system rather than the value specified in sps_ra_host.

Workaround: Create and use a snapshot rather than the RA installer. Follow these steps:

  1. Create an RA snapshot. For example, on a Solaris x86 system on which the N1 Grid SPS 5.0 RA is installed, use these commands:


      # cd /opt/SUNWn1sps/N1_Grid_Service_Provisioning_System
      # cat > /tmp/exclude
      ./agent/data
      <control-D EOF>
      # tar cfX - /tmp/exclude ./agent ./common > /tmp/sps_ra_solaris_x86_5.0.tar
  2. Import the snapshot to the JET server, as explained in How to Import N1 Grid SPS RA Installers.


    Note –

    The media path should point to a directory that is reachable from the JET server and that contains the tar file that you created in the previous step.
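Before importing the snapshot, it may be worth confirming that the archive really omits the agent data directory. The following is a minimal sketch; the function name snapshot_excludes_data is hypothetical, and the check assumes that archive members are stored with the relative paths used in step 1.

```shell
#!/bin/sh
# snapshot_excludes_data TARFILE   (hypothetical helper)
# Succeeds only if the archive lists no entries under agent/data.
snapshot_excludes_data() {
    if tar tf "$1" | grep -E '^(\./)?agent/data(/|$)' >/dev/null; then
        echo "snapshot still contains agent/data entries" >&2
        return 1
    fi
    return 0
}
```

For example, snapshot_excludes_data /tmp/sps_ra_solaris_x86_5.0.tar returns a non-zero status if the exclude list was not applied.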


Unable to Change Location of OS Provisioning Scripts on Windows Boot and Install Server (6251010)

Description: You cannot change the OS provisioning script location for a Windows boot and install server once the Windows boot and install server is created.

Workaround: Create a new Windows boot and install server that has a different name.

Solaris 10 Installation on Target Host Requires User Response (6245773)

Description: If a time server cannot be reached during installation, the installation becomes interactive. During installation, if the time server was not specified, the sysidcfg file uses the “Solaris 10 Jet Server” as the time server. On Solaris 10 systems, the boot and install servers do not start the time services by default.

Workaround: There are two ways to resolve this issue.

Warnings for DHCP Settings Are Not Reported to the User Interface (6248485)

Description: The provisioning operation fails because the DHCP settings are incorrect. There is no message shown in stdout or stderr.

Workaround: The incorrect settings cause the OS provisioning subnet to be created with wrong values. Look at the /var/adm/n1osp* log files on the OS provisioning server for the DHCP error.
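Because the DHCP error is not shown in the browser interface, a quick scan of the log files can save time. The following is a minimal sketch; the function name dhcp_log_lines is hypothetical, and searching for the string "dhcp" is only an assumption about how the messages are worded.

```shell
#!/bin/sh
# dhcp_log_lines DIR   (hypothetical helper)
# Print lines that mention DHCP from DIR/n1osp* files, prefixed by file name.
dhcp_log_lines() {
    for f in "$1"/n1osp*; do
        [ -f "$f" ] || continue
        # Passing /dev/null as a second file makes grep print the file name.
        grep -i 'dhcp' "$f" /dev/null
    done
    return 0
}
```

On the OS provisioning server, the call would be dhcp_log_lines /var/adm.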

JET Attach Fails Because osp_pkgchk.sh File Does Not Exist (6257748)

Workaround: Install the jet utilities tar file on the JET server physical host prior to the JET server attach.

    In the N1 Grid SPS browser interface, follow these steps:

  1. In the Common Tasks section of the N1 Grid SPS browser interface, select OS Provisioning.

  2. On the OS Provisioning Common Tasks page, click Manage in the JET Solaris Image Servers section.

  3. Select the referenced component /com/sun/n1osp/resource/jet_util.tar.

  4. On the Component Details page, click the Run action next to the default:install procedure.

  5. Select the physical target host name on which the JET server is installed.

  6. Click Run Plan (includes preflight).

Cannot View OS Installation Log by Host Provision Status in Non-EUC Locale (6255797)

Description: Installation log files are always in the related EUC locale, regardless of the locale specified for the OS installation. When the remote agent locale is different from this EUC locale, you cannot view the log file correctly through the Status Monitoring page because the locales do not match.

Workaround: Connect to the service port or console (if applicable) with the proper locale to view the log files directly.

Troubleshooting

General Troubleshooting Guidelines

Problem:

Provision plan ran successfully, but provisioning on target failed.

Solution:

There could be several problems. The paragraphs below list some of the possible reasons why provisioning might have failed. Use this list to isolate the problem.

  1. Look into the provisioning logs by viewing the Host Status information in the N1 Grid SPS browser interface. Run the Host Status plan on the target. Check both stdout and stderr to see the reason for failure.

  2. Log in to the OS provisioning server and check the logs in the /var/run/n1osp/log folder and the console output in the /var/run/n1osp/console folder. Also, check the messages in the /var/adm/n1osp* files. To view more detail in the /var/adm/n1osp* files, change the value of the n1.isp.core.debuglevel property in the /opt/SUNWn1osp/etc/n1osp.properties file. For example, n1.isp.core.debuglevel=25.

  3. Obtain a console to the target and re-provision to see the reason for the failure.

  4. Check for network connectivity between the OS provisioning server, the boot and install server, and the target host. Use ping and snoop to check for packets between the OS provisioning server and the boot and install server, between the OS provisioning server and the target host, and between the boot and install server and the target host.


    Note –

    If the OS provisioning server or boot and install server has several IP addresses, use the addresses that are used for provisioning.


  5. Check whether DHCP packets from the target are reaching the OS provisioning server. Check if the target has been configured to boot over the network using DHCP. If the target host has several network interfaces, ensure that it is using the interface specified in the host profile to boot and install the operating system. Re-provision the target and check if the /etc/dhcpd.conf file on the OS provisioning server has entries for the target host. DHCP is configured to respond to the target only for the duration of OS provisioning, so you need to re-provision the target to see if DHCP has been configured properly.

  6. Check if the Solaris, Linux, and Windows boot and install servers have been set up properly. Read the appropriate OS documentation for more details. Verify that the boot and install server is properly configured to share the OS media using NFS (for Solaris and Linux) or CIFS (for Windows). Verify that the IP addresses used in the OS profile and the IP addresses configured on the boot and install server match. Check if the TFTP services are configured to run on boot and install server.

  7. Check the OS profile and host profile information for IP addresses, passwords, and other information.

  8. If an OS fails to install or hangs, check if your OS profile has the necessary drivers to boot the targets over the network. See the respective OS documentation for more details.
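The debug-level change described in step 2 can be made with a small helper. The following is a minimal sketch, assuming that properties are stored one per line as key=value with no spaces around the equals sign. The function name set_property is hypothetical.

```shell
#!/bin/sh
# set_property FILE KEY VALUE   (hypothetical helper)
# Replace the line "KEY=..." in FILE, or append it if KEY is not present.
# A backup copy of the original is kept as FILE.bak.
set_property() {
    file=$1 key=$2 value=$3
    cp "$file" "$file.bak" || return 1
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "=" }
        $1 == k { print k "=" v; done = 1; next }
        { print }
        END { if (!done) print k "=" v }
    ' "$file.bak" > "$file"
}
```

For the example in step 2, the call would be set_property /opt/SUNWn1osp/etc/n1osp.properties n1.isp.core.debuglevel 25. Keeping the backup makes it easy to restore the previous debug level after troubleshooting.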

Problem:

Error while creating profiles, creating hosts, or provisioning targets.

Solution:

Errors can occur at several points in the provisioning process. Check the following:

  1. Verify that the N1 Grid SPS Remote Agents (RAs) are installed correctly on the boot and install servers. Verify that the Master Server can reach the RAs. For more information, see the N1 Grid Service Provisioning System 5.0 Installation Guide.

  2. Verify that the N1 Grid SPS command-line interface (CLI) is installed on the OS provisioning server and the Solaris boot and install server. Run a simple cr_cli command.

  3. Check the stdout and stderr of the plan.

  4. Verify that valid values are provided for the plan and component variables.

Solving Solaris-Related Problems

Problem:

I do not understand the sequence of operations for provisioning the Solaris operating system.

Solution:

The sequence for Solaris is as follows:

  1. The JET server/Solaris boot and install server is prepared for the target host.

  2. DHCP on the OS provisioning server is set up for the target host.

  3. The target host is rebooted to boot over the network using DHCP.

  4. The target host broadcasts DHCP discover packets.

  5. The DHCP server on the OS provisioning server sends a DHCP offer.

  6. The target host broadcasts DHCP request packets.

  7. The DHCP server on the OS provisioning server sends a DHCP ACK.

  8. The target uses TFTP to get the boot kernel from the JET server.

  9. The target installs the OS by getting files over NFS from the JET server.

Problem:

While importing a Solaris image, the plan times out.

Solution:

Set the default timeout for plans on the Master Server. Follow these steps:

  1. Edit the following configuration file:


    /opt/SUNWn1sps/N1_Grid_Service_Provisioning_System_5.0/server/config/config.properties
  2. Set the following properties:

    pe.nonPlanExecNativeTimeout=12000
    pe.defaultPlanTimeout=12000
  3. Restart the Master Server.


    # cr_server stop
    # cr_server start
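Before restarting the Master Server, you might want to confirm that both properties were saved. The following is a minimal sketch that assumes one key=value pair per line. The function name check_timeouts is hypothetical.

```shell
#!/bin/sh
# check_timeouts FILE MIN   (hypothetical helper)
# Succeeds if both plan timeout properties in FILE are set to at least MIN.
check_timeouts() {
    awk -F= -v min="$2" '
        $1 == "pe.nonPlanExecNativeTimeout" || $1 == "pe.defaultPlanTimeout" {
            seen[$1] = 1
            if ($2 + 0 < min + 0) { print $1 " is below " min; bad = 1 }
        }
        END {
            if (!("pe.nonPlanExecNativeTimeout" in seen)) {
                print "pe.nonPlanExecNativeTimeout is not set"; bad = 1
            }
            if (!("pe.defaultPlanTimeout" in seen)) {
                print "pe.defaultPlanTimeout is not set"; bad = 1
            }
            exit (bad ? 1 : 0)
        }' "$1"
}
```

Pass the path of the config.properties file from step 1 and the value 12000. The function prints one line for each property that is missing or too low.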
Problem:

While importing a Solaris image, the plan fails.

Solution:

Follow these steps to analyze the problem:

  1. Check the stdout and stderr messages of the plan.

  2. Verify that there is enough disk space to hold the media.

  3. Check the values for the variables. Make sure that all paths are correct and complete.

Problem:

The provision plan fails, indicating a failure in the spsra module.

Solution:

The N1 Grid SPS RA is installed on the target by using either a snapshot of the N1 Grid SPS RA installed on the JET server or an RA distribution. Check the install_ra_from_snapshot_spsra value in the OS profile. If the target host and the JET server have different architectures, you must set install_ra_from_snapshot_spsra to "n" and install the N1 Grid SPS RA on the JET server using the "Jet" component.

Problem:

While provisioning Solaris x86, installation becomes interactive.

Solution:

Ensure that the console variable x86_console_base_config in the OS profile is properly configured. For v20z targets, this should be ttya. If the installation fails indicating that the boot partition is too small, the most likely cause is that another OS that uses a different disk label format was previously installed. Use the fdisk utility to repartition the disk.

Solving Linux-Related Problems

Problem:

I do not understand the sequence of operations for provisioning the Linux operating system.

Solution:

The sequence for Linux is as follows:

  1. The Linux boot and install server is prepared for the target host.

  2. DHCP on the OS provisioning server is set up for the target host.

  3. The target host is rebooted to boot over the network using DHCP.

  4. The target host broadcasts DHCP discover packets.

  5. The DHCP server on the OS provisioning server sends a DHCP offer.

  6. The target host broadcasts DHCP request packets.

  7. The DHCP server on the OS provisioning server sends a DHCP ACK.

  8. The target uses TFTP to get the boot kernel from the Linux boot and install server.

  9. The target installs the OS by getting files over NFS from the Linux boot and install server.

Problem:

Installation starts, but the user is prompted that the disk label could not be read.

Solution:

This problem most likely indicates that another OS was previously installed that uses a disk label format that Linux does not recognize as the default for the architecture. To force the installer to re-initialize the disk label to the default for the architecture without prompting the user, add the --initlabel option to the clearpart directive of the kickstart configuration file.
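The kickstart change can be applied with a short filter. The following is a minimal sketch; the function name add_initlabel is hypothetical, and the sketch assumes that the clearpart directive sits on a single line.

```shell
#!/bin/sh
# add_initlabel KSFILE   (hypothetical helper)
# Print KSFILE with --initlabel appended to any clearpart line that lacks it.
add_initlabel() {
    awk '/^[ \t]*clearpart/ && !/--initlabel/ { $0 = $0 " --initlabel" }
         { print }' "$1"
}
```

The function writes to standard output, so the original file is left untouched. For example, a line that reads clearpart --all becomes clearpart --all --initlabel.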

Problem:

Installation cannot get IP address through DHCP.

Solution:

Try the following solutions:

Problem:

Target gets the DHCP packet, but fails to boot.

Solution:

Try the following solutions:

Problem:

You see the following message on the console:


VFS: mounted root (ext2) filesystem

Solution:

The Linux kernel has redirected the console elsewhere. Change the console settings in the PXE configuration file.
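To see where the kernel is currently told to send console output, you can list the console arguments in the PXE configuration file. The following is a minimal sketch that assumes a PXELINUX-style file in which kernel arguments appear on lines that start with the keyword append. That layout and the function name show_console_args are assumptions, not details taken from this guide.

```shell
#!/bin/sh
# show_console_args PXECFG   (hypothetical helper)
# Print each console= kernel argument found on "append" lines of PXECFG.
show_console_args() {
    awk 'tolower($1) == "append" {
        for (i = 2; i <= NF; i++) if ($i ~ /^console=/) print $i
    }' "$1"
}
```

If the output does not name the device on which you are watching the installation, adjust the console arguments accordingly.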

Problem:

Installation goes into interactive mode.

Solution:

Check the kickstart file for errors. Verify that the server IP address and paths are correct and complete.

Solving Windows-Related Problems

Problem:

I do not understand the sequence for provisioning the Windows operating system.

Solution:

The sequence for provisioning Windows is as follows:

  1. The Windows boot and install server is prestaged for the target host in the Active Directory.

  2. DHCP on the OS provisioning server is set up for the target host.

  3. The target host is rebooted to boot over the network using DHCP.

  4. The target host broadcasts DHCP discover packets.

  5. The DHCP server on the OS provisioning server and BINL on the Windows boot and install server send DHCP offers.

  6. The target chooses the DHCP offer from the OS provisioning server and broadcasts DHCP request packets.

  7. The DHCP server on the OS provisioning server sends a DHCP ACK.

  8. The target broadcasts DHCP discover packets again (for the PXE boot server).

  9. The BINL on the Windows boot and install server sends a DHCP offer (for PXE).

  10. The target uses the next-server information in the DHCP packet and uses TFTP to get the boot kernel from the Windows boot and install server.

  11. The target goes through the text-mode installation by getting files over CIFS from the Windows boot and install server.

  12. The target reboots.

  13. By this time the DHCP server is cleared to not respond to the target host, so the target boots from the disk.

  14. The target goes through the GUI-mode installation.

  15. The target reboots and runs the scripts in the GuiRunOnce section of the SIF file.

Issues Related to PXE/DHCP/BINLSVC

Problem:

How do I know I have the correct PXE ROM version?

Solution:

When the NetPC or client computer ROM-boots, a PXE (LSA) ROM message appears on the screen. The version of the PXE ROM code is displayed during the boot sequence of the client machine. Windows 2000 RIS supports PXE ROM version .99c or later. If you are not successful with the existing ROM version, you might need to obtain a newer version of the PXE-based ROM code from your OEM.

Problem:

How do I know if the client computer has received an IP Address and has contacted the Remote Installation Server?

Solution:

When the client computer boots, the PXE Boot ROM begins to load and initialize. The following four-step sequence occurs with most Net PC or PXE ROM-based computers:


Note –

The sequence may be different on your computer.


  1. The client computer displays the message BootP. This message indicates the client is requesting an IP address from the DHCP server.

    Troubleshooting: If the client does not get past the BootP message, the client is not receiving an IP address. Check the following possibilities:

    • Is the DHCP server available and has the service started? DHCP and RIS servers must be authorized in the Active Directory for their services to start. Check that the service has started and that other non-remote boot-enabled clients are receiving IP addresses on this segment.

    • Can other client computers, such as non-remote boot-enabled clients, receive an IP address on this network segment?

    • Does the DHCP server have a defined IP address scope and has it been activated? To verify this feature, click Start, point to Programs, point to Administrative Tools, and click DHCP. Alternatively, you can click Start, point to Programs, point to Administrative Tools, and click Event Viewer.

    • Are there any error messages in the event log under the System Log for DHCP?

    • Is a router between the client and the DHCP server not allowing DHCP packets through?

  2. When the client receives an IP address from the DHCP server, the message changes to DHCP. This indicates the client successfully leased an IP address and is now waiting to contact the RIS server.

    Troubleshooting: If the client does not get past the DHCP message, the client is not receiving a response from the remote installation server. Check the following possibilities:

    • Is the remote installation server available and has the RIS service (BINLSVC) started? RIS servers must be authorized in the Active Directory for their services to start. To ensure that the service has started, use the DHCP snap-in (click Start, point to Programs, point to Administrative Tools, and click DHCP).

    • Are other remote boot-enabled clients receiving the Client Installation wizard? If so, this may indicate this client computer is not supported or is having remote boot ROM-related problems. Check the version of the PXE ROM on the client computer.

    • Is a router between the client and the remote installation server not allowing the DHCP-based requests and responses through? When the RIS client and the RIS server are on separate subnets, the router between the two systems must be configured to forward DHCP packets to the RIS server. This is because RIS clients discover a RIS server by using a DHCP broadcast message. Without DHCP forwarding set up on the router, the clients' DHCP broadcasts never reach the RIS server. This DHCP forwarding process is sometimes referred to as DHCP Proxy or IP Helper Address in router configuration manuals.

      To verify the DHCP setup, click Start, point to Programs, point to Administrative Tools, and click Event Viewer. Refer to your router instructions for setting up DHCP forwarding on your specific router.

    • Are any error messages in the event log under the System or Application logs specific to RIS (BINLSVC), DNS, or the Active Directory?

  3. The client changes to BINL or prompts the user to press the F12 key. This means that the client has contacted the RIS server and is waiting to TFTP the first image file, OSChooser. You might not see the BINL and TFTP messages, because on some machines this sequence flashes by too quickly. (Note: Pressing the F12 key is automated by swapping the startrom.com and startrom.n12 files under the <reminst_share>\OSChooser\i386 folder.)

    Troubleshooting: If the client machine does not get a response from the Remote Installation Server, the client times out and displays an error that it did not receive a file from DHCP, BINL, or TFTP. In this case, the RIS server did not answer the client computer. Stop and restart the BINLSVC. From the Start menu, click Run, and type CMD. Enter these commands:

      Net Stop BINLSVC
      Net Start BINLSVC

    If the client machine does not receive an answer after you stop and restart the service, check the Remote Installation Server object properties to ensure that the correct settings are in place. Verify that RIS is set to "Respond to client computers requesting service" and "Do not respond to unknown client computers". Click Start, point to Programs, point to Administrative Tools, and click Event Viewer to check the Event log on the RIS server for any errors relating to DHCP, DNS, or RIS (BINLSVC).

  4. At this point, the client should have downloaded and displayed the Client Installation wizard application with a Welcome screen greeting the user.
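The four-step sequence above amounts to a lookup from the last message shown on the client to the component to check first. The following is a minimal sketch of that lookup. The function name pxe_stage_hint and the wording of the hints are illustrative only.

```shell
#!/bin/sh
# pxe_stage_hint MESSAGE   (hypothetical helper)
# Map the last message shown by the PXE client to the first thing to check.
pxe_stage_hint() {
    case $1 in
        BootP|BOOTP|bootp)
            echo "No IP address leased: check the DHCP server, its scope, and routers" ;;
        DHCP|dhcp)
            echo "No reply from RIS: check that BINLSVC is running and DHCP forwarding is set up" ;;
        BINL|binl|F12)
            echo "No boot file received: restart BINLSVC and check the RIS server settings" ;;
        *)
            echo "Unknown stage: $1"
            return 1 ;;
    esac
}
```

For example, pxe_stage_hint DHCP points to the checks listed under step 2.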

Problem:

Is the Pre-Boot portion of the PXE-based Remote Boot ROM Secure?

Solution:

No. The entire ROM sequence and OS installation/replication is not secure with regard to packet type encryption, client/server spoofing, or wire sniffer based mechanisms. As such, use caution when using the RIS service on your corporate network. Ensure that you allow only authorized RIS servers on your network and that the number of administrators allowed to install or configure RIS servers is controlled.

Problem:

While booting from the network, the target host displays the following error message:


No proxyDHCP offers were received.

Solution:

The client machine/target host is not able to obtain an IP address from the DHCP server. For more details, see Step 2 above. See the following Microsoft knowledge base articles:

Problem:

How do PXE Client, DHCP and RIS server interact?

Solution:

See the following Microsoft knowledge base article: Description of PXE Interaction Among PXE Client, DHCP, and RIS Server.

Problem:

Target host displays the following message while booting from the network:


ARP Timeout message

Solution:

You see this error message when the client machine gets a valid IP address from the DHCP server, but an invalid PXE boot server IP address (the RIS server's IP address in the provisioning subnet) from the BINL service on the RIS server. This is observed on some old machines like HP-Lpr when they are run as multi-homed RIS servers. However, this problem does not occur on newer hardware like the HP ProLiant DL 360 G3 series of server machines from the same vendor, even when they are configured as multi-homed RIS servers. To enable old machines like HP-Lpr to work as RIS servers without displaying this error message, make sure the machines are not multi-homed. In other words, the system should have only one enabled interface, and that interface should be in the provisioning subnet.

For more information, see the following Microsoft knowledge base article: A multi-homed RIS server may not answer all clients, and you may receive an error message on PXE clients that are running Windows Server 2003 or Windows 2000.

Problem:

Text-mode installation does not boot.

Solution:

Try the following solutions:

Problem:

GUI-mode installation goes into interactive mode.

Solution:

Try the following solutions:

Problem:

How do I change the default timeout values for text-mode installation and GUI-mode installation for each client?

Solution:

Before starting the provisioning activity, change the default timeout values for the following properties in the ris.properties file (usually located under <n1osp folder>/etc/) on your N1 OS provisioning server.

ris.InitialBootTimeout
ris.OsInstallTimeout

Issues Related to Remote Installation Services (RIS)

Problem:

How do you enable debug mode for remote installation servers?

Solution:

Follow the instructions as described in the Microsoft knowledge base article 236033.

Problem:

How do you automate the CIW screens for RIS services?

Solution:

See the following Microsoft knowledge base articles:

Problem:

Where can I find more information on Setup Information Answer files (.sif files)?

Solution:

See the deploy.cab file on the Windows 2000/2003 Server Resource Kit CD for more details.

Problem:

How do you change the Administrator Password during RIS Installation?

Solution:

See the following Microsoft knowledge base article: How to Set the Administrator Password During RIS Installation - 257948.

Problem:

How do you add drivers to a RIS Image?

Solution:

See the following Microsoft knowledge base articles:

HOW TO: Add Third-Party OEM Network Adapters to RIS Installations - 246184

HOW TO: Add OEM Plug and Play Drivers to Windows Installations - 254078

Problem:

How do you slipstream a service pack into a RIS image?

Solution:

See the following articles on the Microsoft web site:

Problem:

You see the following error message during text-mode installation:


Illegal or Missing File Types Specified in Section SCSI.Name

Solution:

See the Microsoft knowledge base article 275334.

Problem:

You see an error message during text-mode installation when you try to install a RIS image. The error message includes:


Setup Cannot Continue

Solution:

See the Microsoft knowledge base article 830751.

Problem:

You see the following error message during text-mode installation:


INF File Tmp\<GUID_number.sif> Is Corrupt or Missing

Solution:

See the Microsoft knowledge base article 224830.

Problem:

You see the following error message during text-mode installation:


The Operating System Image You Selected Does Not Contain the Necessary Drivers

Solution:

See the Microsoft knowledge base article 247983.

Problem:

You see the following error message during text-mode installation:


The Operating System Image You Selected Does Not Contain the Necessary 
Drivers for Your Network Adapter. Try Selecting a Different Operating System 
Image. If the Problem Persists, Contact Your System Administrator.

Solution:

See the Microsoft knowledge base article 315074.

Problem:

The remote install client hangs at the end of text-mode setup.

Solution:

See the Microsoft knowledge base article 226941.

Problem:

The RIS setup stops responding on the “Setup is Starting Windows” screen.

Solution:

See the Microsoft knowledge base article 320865.

Issues Related to GUID

Problem:

Where do I look on the client computer to find the GUID/UUID for pre-staging clients in the Active Directory for use with RIS?

Solution:

The GUID/UUID for client computers that are PC98 or Net PC compliant can be found (in most cases) in the system BIOS. OEMs are encouraged to ship a floppy disk containing a comma-separated file or spreadsheet that contains a mapping of serial number to GUID/UUID. This allows you to script pre-staging client computers within the Active Directory. OEMs are also encouraged to post the GUID/UUID on the outside of the computer case for easy identification and pre-staging of computer accounts. If the GUID is not found in the above-mentioned locations, you can sniff the network traffic of the client to locate the DHCP Discover packet. Within the DHCP Discover packet, you can find the 128-bit GUID/UUID (16 bytes, written as 32 hexadecimal digits).
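When GUIDs are copied from a BIOS screen, a case label, or a packet capture, a quick format check can catch transcription errors before pre-staging. The following is a minimal sketch; the function name normalize_guid is hypothetical. It only checks that the value is 32 hexadecimal digits and does not convert to whatever byte order the Active Directory expects.

```shell
#!/bin/sh
# normalize_guid GUID   (hypothetical helper)
# Strip braces and hyphens, upper-case the result, and print it if it is
# exactly 32 hexadecimal digits (128 bits). Returns non-zero otherwise.
normalize_guid() {
    g=$(printf '%s' "$1" | tr -d '{}-' | tr 'abcdef' 'ABCDEF')
    case $g in
        *[!0-9A-F]*) return 1 ;;
    esac
    [ "${#g}" -eq 32 ] || return 1
    printf '%s\n' "$g"
}
```

A value with a missing digit or a stray character makes the function return a non-zero status without printing anything.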

Problem:

Two client machines have the same GUID value.

Solution:

RIS fails in this case, because RIS identifies each target host as a computer object in its Active Directory with a unique GUID value. If multiple objects have the same GUID, the RIS client machine reports an error during its setup phase. You see the following message:


BINLSVC found Duplicate GUID accounts on the RIS Server. 
Please contact your system Administrator.

To overcome this issue, delete any old computer accounts with the same GUID in the RIS server's Active Directory before proceeding further.
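Duplicates can also be caught before pre-staging. The following is a minimal sketch that assumes a comma-separated mapping file, like the OEM file mentioned earlier, with the serial number in the first column and the GUID in the second. The function name duplicate_guids and the column layout are assumptions.

```shell
#!/bin/sh
# duplicate_guids CSVFILE   (hypothetical helper)
# CSVFILE lines look like "serial,guid". Print each GUID that occurs more
# than once, ignoring case, braces, hyphens, spaces, and carriage returns.
duplicate_guids() {
    awk -F, 'NF >= 2 { print toupper($2) }' "$1" |
        tr -d '{} \r-' | sort | uniq -d
}
```

An empty result means that no GUID in the file is repeated. Any GUID that is printed needs its older computer account removed, as described above.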