CHAPTER 9

Troubleshooting the Linux PXE Boot Installation

This chapter describes common problems that may occur during or after a PXE boot installation, and how to resolve them.

Errors During Startup

The following errors appear at startup when PXE booting the blade:

PXE-E51: No DHCP or proxyDHCP offers were received.
PXE-M0F: Exiting Broadcom ROM.

Cause

The DHCP service is not configured correctly.

Solution

To verify that the DHCP service is running on the DHCP server and listening on the correct port (UDP port 67), use the following netstat command:

$ netstat -an | fgrep -w 67
udp        0      0 0.0.0.0:67              0.0.0.0:*

If no listening socket is shown, check your DHCP setup and configuration. If a listening socket is shown, the DHCP service is running and the problem may lie elsewhere, for example in firewall filtering or cabling.
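The netstat check can also be scripted. The following is a minimal sketch, assuming Linux-style netstat -an output in which the local address is the fourth field; the function name has_udp_listener is hypothetical and is not part of any supplied software.

```shell
# has_udp_listener PORT   (hypothetical helper name)
# Reads "netstat -an" output on stdin and succeeds (exit status 0) if a
# UDP socket is bound to PORT. Assumes the local address is field 4.
has_udp_listener() {
    awk -v port="$1" '
        $1 ~ /^udp/ && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Typical use on the DHCP server (port 67):
#   netstat -an | has_udp_listener 67 || echo "nothing is listening on UDP port 67"
```

The same helper can be used with port 69 to check the TFTP service.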

Errors After Obtaining IP Address (Issue 1)

During a PXE boot installation, the following errors appear after obtaining the IP address:

PXE-E53: No boot filename received
PXE-M0F: Exiting Broadcom PXE ROM.

Cause

The DHCP service did not provide the name of a boot file.

Solution

Ensure that the filename statement is correctly specified in the /etc/dhcpd.conf file on the PXE server.

This problem may also occur if the DHCP lease is received from a different machine. Normally, only one DHCP server should be configured on a single network segment.
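To see which boot file name the DHCP configuration actually hands out, the filename entry can be extracted from the file. This is a rough sketch that assumes the entry is written on a single line with the name in double quotes; the function name boot_filename is hypothetical.

```shell
# boot_filename   (hypothetical helper name)
# Reads a dhcpd.conf on stdin and prints the value of every active
# (uncommented) filename entry, one per line.
boot_filename() {
    sed -n 's/^[[:space:]]*filename[[:space:]]\{1,\}"\([^"]*\)".*/\1/p'
}

# Typical use on the PXE server:
#   boot_filename < /etc/dhcpd.conf
```

If the command prints nothing, no active filename entry was found, which would be consistent with the PXE-E53 error.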

Errors After Obtaining IP Address (Issue 2)

During a PXE boot installation, the following error appears after obtaining the IP address:

PXE-E32: TFTP Open timeout

Cause

The TFTP service is not configured correctly.

Solution

To verify that the TFTP service is running and listening on the correct port (UDP port 69), use the following netstat command:

$ netstat -an | fgrep -w 69
udp        0      0 0.0.0.0:69              0.0.0.0:*

If no listening socket is shown, check your TFTP setup and configuration. If a listening socket is shown, the TFTP service is running and the problem may lie elsewhere, for example in firewall filtering or cabling.

To test the TFTP service, install a TFTP client on a different machine and try to download the pxelinux.bin file:

# cd /tmp
# tftp PXE-server
tftp> get /as-2.1/sun/pxelinux.bin
Received 10960 bytes in 0.1 seconds
tftp> quit

Errors After Obtaining IP Address (Issue 3)

During a PXE boot installation, the following errors appear after obtaining the IP address:

PXE-T01: File not found
PXE-E3B: TFTP Error - File Not found
PXE-M0F: Exiting Broadcom PXE ROM.

Cause

The boot file named by the DHCP service does not exist on the PXE server, or is not at the path that the TFTP service expects.

Solution

In the /etc/xinetd.d/tftp file on the PXE server, check the arguments passed to the TFTP daemon (the server_args line). It is recommended that you use -s /tftp, so that the TFTP service uses chroot(2) to change its top-level directory to /tftp. The DHCP filename argument is then relative to that top-level directory and must not include the /tftp prefix.
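The relationship between the path on disk and the name given to DHCP can be illustrated with a small helper. This is only a sketch: the function name tftp_relative is hypothetical, and the paths in the usage comment reuse the example file name from this chapter.

```shell
# tftp_relative ROOT PATH   (hypothetical helper name)
# Prints PATH as seen by a TFTP daemon started with "-s ROOT", which is
# the value to use as the DHCP filename. Fails if PATH is outside ROOT.
tftp_relative() {
    root=${1%/}                     # tolerate a trailing slash on ROOT
    case "$2" in
        "$root"/*)
            rel=${2#"$root"}        # strip the ROOT prefix
            printf '%s\n' "$rel"
            ;;
        *)
            return 1
            ;;
    esac
}

# Example:
#   tftp_relative /tftp /tftp/as-2.1/sun/pxelinux.bin
# prints /as-2.1/sun/pxelinux.bin
```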

To test the TFTP service, install a TFTP client on a different machine and try to download the boot file:

# cd /tmp
# tftp PXE-server
tftp> get /as-2.1/sun/pxelinux.bin
Received 10960 bytes in 0.1 seconds
tftp> quit

Error After Installing the Linux Kernel (Issue 1)

During a PXE boot installation, the following error appears after loading the Linux kernel:

+------+ Kickstart Error +-------+
|                                |
| Error opening: kickstart file  |
| /tmp/ks.cfg: No such file or   |
| directory                      |
|                                |
|            +----+              |
|            | OK |              |
|            +----+              |
|                                |
|                                |
+--------------------------------+

Cause

NFS is not working correctly on the PXE server.

Solution

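Validate your NFS configuration by listing the file systems that the PXE server exports (for example, with showmount -e) and checking that the directory containing the installation files appears in the list.

If this path is not in the output, check your NFS setup and configuration.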
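One way to check the export list from a script is sketched below. It assumes the usual showmount -e layout (one header line, then one exported path per line in the first column); the function name is_exported and the path /export/install are hypothetical, so substitute the directory that holds your installation files.

```shell
# is_exported PATH   (hypothetical helper name)
# Reads "showmount -e" output on stdin and succeeds (exit status 0) if
# PATH is listed as an exported directory. The header line is skipped.
is_exported() {
    awk -v path="$1" '
        NR > 1 && $1 == path { found = 1 }
        END { exit !found }
    '
}

# Typical use (PXE-server and /export/install are placeholders):
#   showmount -e PXE-server | is_exported /export/install \
#       || echo "/export/install is not exported"
```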

This problem may also occur if the blade is not correctly connected to the PXE server. If you have only one switch and system controller (SSC) installed on the chassis, ensure that the SSC is installed in position 0. See the Sun Fire B1600 Chassis Administration Guide for information on installing the SSC.

If the NFS services are working normally and can be used from other machines on the network, it is likely that the PXE server has provided the wrong kernel to the blade. This occurs if the Linux distribution installed on the PXE server does not exactly match the Linux distribution against which the supplemental CD (supplied with the Linux blade) was built. An exact match is necessary because a module versioning mismatch can cause the 5704 network driver (suntg3) to fail to load.

Root Password Message After Installing the Linux Kernel

During a PXE boot installation, the following message appears after loading the Linux kernel:

+--------------+ Root Password +---------------+
|                                              |
|  Pick a root password. You must type it      |
|  twice to ensure you know what it is and     |
|  didn't make a mistake in typing. Remember   |
|  that the root password is a critical part   |
|  of system security!                         |
|                                              |
| Password:           ________________________ |
| Password (confirm): ________________________ |
|                                              |
|        +----+               +------+         |
|        | OK |               | Back |         |
|        +----+               +------+         |
|                                              |
|                                              |
+----------------------------------------------+

Cause

No default root password has been specified in ks.cfg.

Solution

In the sun/install/ks.cfg file, ensure that the rootpw command is not commented out, and that you have specified a root password. See Chapter 4 for information on entering a root password.
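A quick check for an active rootpw line can be scripted as follows. This is a minimal sketch; the function name has_rootpw is hypothetical, and the check only confirms that an uncommented rootpw line with an argument exists, not that the password is valid.

```shell
# has_rootpw   (hypothetical helper name)
# Reads a kickstart file on stdin and succeeds (exit status 0) if it
# contains an uncommented rootpw line followed by at least one argument.
has_rootpw() {
    grep -Eq '^[[:space:]]*rootpw[[:space:]]+[^[:space:]]'
}

# Typical use, run from the directory that contains sun/install:
#   has_rootpw < sun/install/ks.cfg || echo "no root password set in ks.cfg"
```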

Error After Rebooting

After completing a PXE boot installation and rebooting, the following screen appears:

GRUB  version 0.92  (634K lower / 522176K upper memory)
 
[ Minimal BASH-like line editing is supported.  For the first word, TAB lists possible command completions.  Anywhere else TAB lists the possible completions of a device/filename. ]
 
grub>

Cause

The PXE boot installation did not complete.

Solution

This problem may occur if the blade is removed or powered off during installation. You must re-install the blade.

Blade Does Not Boot From The Disk

After successfully completing a PXE boot installation, the blade continues to boot from the network instead of the disk.

Cause

The BIOS is configured to boot from the network by default.

Solution

At the SC prompt, use the bootmode reset_nvram sn command (where n is the number of the slot containing the blade) to reset the BIOS to boot from the disk by default.

First Boot From Disk Runs fsck

When booting the blade from the disk for the first time, the blade runs fsck to repair its file systems.

Cause

The blade did not unmount its file systems at the end of the installation.

Solution

To unmount all file systems and enable the blade to reboot correctly, ensure that you press Enter at the final OK prompt during the PXE boot installation. See Chapter 4 for more information.

Installer Hangs or Fails During PXE Boot Installation

When PXE installing a blade, the installer hangs or fails part way through the installation.

Cause

The PXE server may be using the eepro100 driver.

Solution

1. Check if the PXE server is using the eepro100 driver by examining the /etc/modules.conf file for a line equivalent to:

alias eth0 eepro100

Note - The eth instance may be different depending on your hardware setup.

2. Change the line to:

alias eth0 e100

This avoids a known interaction issue between the i82557/i82558 10/100 Ethernet hardware and the Broadcom 5704.
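The edit in step 2 can be previewed with sed before changing the file. The sketch below writes the modified text to standard output and does not touch the original; the function name use_e100 is hypothetical.

```shell
# use_e100   (hypothetical helper name)
# Reads a modules.conf on stdin and prints it with every
# "alias ethN eepro100" line changed to "alias ethN e100".
# All other lines are passed through unchanged.
use_e100() {
    sed 's/^\([[:space:]]*alias[[:space:]]\{1,\}eth[0-9]\{1,\}[[:space:]]\{1,\}\)eepro100[[:space:]]*$/\1e100/'
}

# Typical use on the PXE server (review the result before replacing the file):
#   use_e100 < /etc/modules.conf > /tmp/modules.conf.new
```

The new driver is not used until the e100 module is actually loaded, for example after the PXE server is rebooted.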

Prompted to Insert Module Disks During PXE Boot (SuSE Only)

When booting a blade during a SuSE installation, the blade does not boot automatically and you are prompted to perform an interactive installation:

Please insert modules disk 3.

You'll find instructions on how to create it in boot/README on CD1 or DVD.

Cause

SuSE expects the DHCP server to supply a default router. If none is supplied, SuSE assumes that the network interface is not functional.

Solution

Ensure that you have specified a default router in the dhcpd.conf file. For example:

ddns-update-style none;
default-lease-time 1800;
max-lease-time 3600;
:
option routers 172.16.11.6;
:
subnet 172.16.11.0 netmask 255.255.255.0 {
  next-server 172.16.11.8;                   # address of your TFTP server
  filename "/<linux_dir>/sun/pxelinux.bin";    # name of the boot-loader program
  range 172.16.11.100 172.16.11.200;         # dhcp clients IP range
}
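To confirm that a default router is configured, the routers option can be extracted from the file. This is a rough sketch that assumes the option is written on one line, as in the example above; the function name default_router is hypothetical.

```shell
# default_router   (hypothetical helper name)
# Reads a dhcpd.conf on stdin and prints the value of every active
# (uncommented) "option routers" entry, one per line.
default_router() {
    sed -n 's/^[[:space:]]*option[[:space:]]\{1,\}routers[[:space:]]\{1,\}\([^;]*\);.*/\1/p'
}

# Typical use on the PXE server:
#   default_router < /etc/dhcpd.conf
```

If the command prints nothing, no default router is being supplied to the blade.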