System Administration Guide: Basic Administration

Chapter 14 Troubleshooting Booting an Oracle Solaris System (Tasks)

This chapter describes the procedures for troubleshooting problems that prevent Oracle Solaris from booting on SPARC and x86 based systems.

The following is a list of the information that is in this chapter:

Troubleshooting Booting on the SPARC Platform (Task Map)

Task 

Description 

For Instructions 

Stop a system for recovery purposes.  

If a damaged file is preventing the system from booting normally, first stop the system to attempt recovery. 

SPARC: How to Stop the System for Recovery Purposes

Force a crash dump and reboot of the system. 

You can force a crash dump and reboot of the system as a troubleshooting measure. 

SPARC: How to Force a Crash Dump and Reboot of the System

Boot a SPARC based system for recovery purposes. 

Boot to repair an important system file that is preventing the system from booting successfully. 

SPARC: How to Boot a System for Recovery Purposes

Boot a SPARC based system that has an Oracle Solaris ZFS root for recovery purposes. 

To recover the root password, or to fix a similar problem that prevents you from logging in to an Oracle Solaris ZFS root environment, boot the system in failsafe mode or from alternate media, depending on the severity of the error.

SPARC: How to Boot to a ZFS Root Environment to Recover From a Lost Password or Similar Problem

Boot a system with the kernel debugger. 

You can boot the system with the kernel debugger to troubleshoot booting problems. Use the boot kmdb command to boot the system.

SPARC: How to Boot the System With the Kernel Debugger (kmdb)

You might need to use one or more of the following methods to troubleshoot problems that prevent the system from booting successfully.

SPARC: How to Stop the System for Recovery Purposes

  1. Type the Stop key sequence for your system.

    The monitor displays the ok PROM prompt.


    ok

    The specific Stop key sequence depends on your keyboard type. For example, you can press Stop-A or L1-A. On terminals, press the Break key.

  2. Synchronize the file systems.


    ok sync
    
  3. When you see the syncing file systems... message, press the Stop key sequence again.

  4. Type the appropriate boot command to start the boot process.

    For more information, see the boot(1M) man page.

  5. Verify that the system was booted to the specified run level.


    # who -r
     .       run-level s  May  2 07:39     3      0  S
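
The run-level check in step 5 can also be scripted. The following is a minimal sketch, assuming a POSIX-compatible shell and the who -r output layout shown above. The function name run_level_from is hypothetical and is not an Oracle Solaris command.

```shell
# Hypothetical helper (not a Solaris command): print the run level found
# in one line of `who -r` output, which is read from standard input.
run_level_from() {
    awk '{
        # The run level is the field that follows the literal "run-level".
        for (i = 1; i < NF; i++) {
            if ($i == "run-level") { print $(i + 1); exit }
        }
    }'
}

# Sample line copied from step 5. On a live system: who -r | run_level_from
echo "   .       run-level s  May  2 07:39     3      0  S" | run_level_from
```

For the sample line, the helper prints s, which indicates single-user mode.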

Example 14–1 SPARC: Stopping the System for Recovery Purposes


Press Stop-A
ok sync
syncing file systems...
Press Stop-A
ok boot

SPARC: Forcing a Crash Dump and Reboot of the System

Forcing a crash dump and reboot of the system is sometimes necessary for troubleshooting purposes. The savecore feature is enabled by default.

For more information about system crash dumps, see Chapter 17, Managing System Crash Information (Tasks), in System Administration Guide: Advanced Administration.

SPARC: How to Force a Crash Dump and Reboot of the System

Use this procedure to force a crash dump of the system. The example that follows this procedure shows how to use the halt -d command to force a crash dump of the system. You will need to manually reboot the system after running this command.

  1. Type the Stop key sequence for your system.

    The specific Stop key sequence depends on your keyboard type. For example, you can press Stop-A or L1-A. On terminals, press the Break key.

    The PROM displays the ok prompt.

  2. Synchronize the file systems and write the crash dump.


    > n
    ok sync
    

    After the crash dump is written to disk, the system will continue to reboot.

  3. Verify that the system boots to run level 3.

    The login prompt is displayed when the boot process has finished successfully.


    hostname console login:

Example 14–2 SPARC: Forcing a Crash Dump and Reboot of the System by Using the halt -d Command

This example shows how to force a crash dump and reboot of the system jupiter by using the halt -d and boot commands.


# halt -d
Jul 21 14:13:37 jupiter halt: halted by root

panic[cpu0]/thread=30001193b20: forced crash dump initiated at user request

000002a1008f7860 genunix:kadmin+438 (b4, 0, 0, 0, 5, 0)
  %l0-3: 0000000000000000 0000000000000000 0000000000000004 0000000000000004
  %l4-7: 00000000000003cc 0000000000000010 0000000000000004 0000000000000004
000002a1008f7920 genunix:uadmin+110 (5, 0, 0, 6d7000, ff00, 4)
  %l0-3: 0000030002216938 0000000000000000 0000000000000001 0000004237922872
  %l4-7: 000000423791e770 0000000000004102 0000030000449308 0000000000000005

syncing file systems... 1 1 done
dumping to /dev/dsk/c0t0d0s1, offset 107413504, content: kernel
100% done: 5339 pages dumped, compression ratio 2.68, dump succeeded
Program terminated
ok boot
Resetting ... 

Sun Ultra 5/10 UPA/PCI (UltraSPARC-IIi 333MHz), No Keyboard
OpenBoot 3.15, 128 MB memory installed, Serial #10933339.
Ethernet address 8:0:20:a6:d4:5b, Host ID: 80a6d45b.

Rebooting with command: boot
Boot device: /pci@1f,0/pci@1,1/ide@3/disk@0,0:a
File and args: kernel/sparcv9/unix
SunOS Release 5.10 Version s10_60 64-bit
Copyright 1983-2004 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
configuring IPv4 interfaces: hme0.
add net default: gateway 172.20.27.248
Hostname: jupiter
The system is coming up.  Please wait.
NIS domain name is example.com
.
.
.
System dump time: Wed Jul 21 14:13:41 2004
Jul 21 14:15:23 jupiter savecore: saving system crash dump
in /var/crash/jupiter/*.0
Constructing namelist /var/crash/jupiter/unix.0
Constructing corefile /var/crash/jupiter/vmcore.0
100% done: 5339 of 5339 pages saved

Starting Sun(TM) Web Console Version 2.1-dev...
.
.
.
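
After the reboot, you might want to confirm that savecore wrote a matching unix.N and vmcore.N pair, as the messages in this example indicate. The following is a minimal sketch, assuming a POSIX-compatible shell. The function name list_saved_dumps is hypothetical, and the directory /var/crash/jupiter is taken from this example.

```shell
# Hypothetical helper: list the dump numbers N for which both unix.N and
# vmcore.N exist in the given savecore directory.
list_saved_dumps() {
    dir=$1
    for f in "$dir"/unix.*; do
        [ -f "$f" ] || continue      # skip the unmatched glob pattern
        n=${f##*.}                   # text after the last dot, for example 0
        if [ -f "$dir/vmcore.$n" ]; then
            echo "dump $n: unix.$n and vmcore.$n present"
        else
            echo "dump $n: vmcore.$n is missing"
        fi
    done
}

# Directory taken from Example 14-2. Prints nothing if no dumps were saved.
list_saved_dumps /var/crash/jupiter
```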

SPARC: How to Boot a System for Recovery Purposes

Use this procedure when an important file, such as /etc/passwd, has an invalid entry and causes the boot process to fail.

Use the stop sequence described in this procedure if you do not know the root password or if you can't log in to the system. For more information, see SPARC: How to Stop the System for Recovery Purposes.

Substitute the device name of the file system to be repaired for the device-name variable in the following procedure. If you need help identifying a system's device names, refer to Displaying Device Configuration Information in System Administration Guide: Devices and File Systems.

  1. Stop the system by using the system's Stop key sequence.

  2. Boot the system in single-user mode.

    • Boot the system from the Oracle Solaris installation media:

      • Insert the Oracle Solaris installation media into the drive.

      • Boot from the installation media in single-user mode.


        ok boot cdrom -s
        
    • Boot the system from the network if local installation media or a local CD or DVD drive is not available. This method requires a network boot server.


      ok boot net -s
      
  3. Mount the file system that contains the file with an invalid entry.


    # mount /dev/dsk/device-name /a
    
  4. Change to the newly mounted file system.


    # cd /a/file-system
    
  5. Set the terminal type.


    # TERM=sun
    # export TERM
    
  6. Remove the invalid entry from the file by using an editor.


    # vi filename
    
  7. Change to the root (/) directory.


    # cd /
    
  8. Unmount the /a directory.


    # umount /a
    
  9. Reboot the system.


    # init 6
    
  10. Verify that the system booted to run level 3.

    The login prompt is displayed when the boot process has finished successfully.


    hostname console login:

Example 14–3 SPARC: Booting a System for Recovery Purposes (Damaged Password File)

The following example shows how to repair an important system file (in this case, /etc/passwd) after booting from a local CD-ROM.


ok boot cdrom -s
# mount /dev/dsk/c0t3d0s0 /a
# cd /a/etc
# TERM=vt100
# export TERM
# vi passwd
(Remove invalid entry)
# cd /
# umount /a
# init 6
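
One kind of invalid entry in a passwd file is a line that does not have exactly seven colon-separated fields. Before you open the editor, such lines could be located with a short filter. The following is a minimal sketch, assuming a POSIX-compatible shell and awk. The function name find_bad_passwd_lines and the sample data are hypothetical, and the check covers only the field count, not the validity of each field.

```shell
# Hypothetical helper: print the line number and text of every entry in a
# passwd-format file that does not have exactly seven colon-separated fields.
find_bad_passwd_lines() {
    awk -F: 'NF != 7 { print NR ": " $0 }' "$1"
}

# Sample data for illustration only; the second entry lacks the shell field.
cat > /tmp/passwd.sample <<'EOF'
root:x:0:0:Super-User:/:/sbin/sh
daemon:x:1:1::/
EOF
find_bad_passwd_lines /tmp/passwd.sample
# On the file system mounted in this example, the file is /a/etc/passwd.
```

For the sample data, the filter reports line 2, the daemon entry.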


Example 14–4 SPARC: Booting a System If You Forgot the root Password

The following example shows how to boot the system from the network when you have forgotten the root password. This example assumes that the network boot server is already available. Be sure to apply a new root password after the system has rebooted.


ok boot net -s
# mount /dev/dsk/c0t3d0s0 /a
# cd /a/etc
# TERM=vt100
# export TERM
# vi shadow
(Remove root's encrypted password string)
# cd /
# umount /a
# init 6
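
The edit to the shadow file in this example empties only the second colon-separated field of the root entry and leaves the other fields unchanged. The following sketch illustrates the before and after state on sample lines. It assumes a POSIX-compatible shell and an awk that accepts the -v option, which older awk implementations might not. The function name clear_password_field and the sample hash strings are hypothetical. The sketch illustrates the edit and is not a replacement for the vi step.

```shell
# Hypothetical helper: read shadow-format lines on standard input and empty
# the password field (field 2) of the named account only.
clear_password_field() {
    awk -F: -v OFS=: -v user="$1" '$1 == user { $2 = "" } { print }'
}

# Sample entries for illustration; the hash strings are made up.
printf '%s\n' 'root:AbCdEfGhIjKlM:6445::::::' 'daemon:NP:6445::::::' |
    clear_password_field root
```

In the output, the root line becomes root::6445:::::: and the daemon line is unchanged.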

SPARC: How to Boot to a ZFS Root Environment to Recover From a Lost Password or Similar Problem

  1. Boot the system in failsafe mode.


    ok boot -F failsafe
    
  2. When prompted, mount the ZFS BE on /a.


    .
    .
    ROOT/zfsBE was found on rpool.
    Do you wish to have it mounted read-write on /a? [y,n,?] y
    mounting rpool on /a
    Starting shell.

  3. Become superuser.

  4. Change to the /a/etc directory.


    # cd /a/etc
    
  5. Correct the passwd or shadow file.


    # vi passwd
    
  6. Reboot the system.


    # init 6
    

SPARC: How to Boot the System With the Kernel Debugger (kmdb)

This procedure shows you the basics for loading the kernel debugger (kmdb). For more detailed information, see the Oracle Solaris Modular Debugger Guide.


Note –

Use the reboot or halt command with the -d option if you do not have time to debug the system interactively. The halt -d command requires a manual reboot of the system afterwards, whereas the reboot -d command reboots the system automatically. See the reboot(1M) man page for more information.


  1. Halt the system, causing it to display the ok prompt.

    To halt the system gracefully, use the /usr/sbin/halt command.

  2. Type either boot kmdb or boot -k to request the loading of the kernel debugger. Press Return.

  3. Enter the kernel debugger.

    The method used to enter the debugger is dependent upon the type of console that is used to access the system:

    • If a locally attached keyboard is being used, press Stop-A or L1-A, depending upon the type of keyboard.

    • If a serial console is being used, send a break by using the method that is appropriate for the type of serial console that is being used.

    A welcome message is displayed when you enter the kernel debugger for the first time.


    Rebooting with command: kadb
    Boot device: /iommu/sbus/espdma@4,800000/esp@4,8800000/sd@3,0
    .
    .
    .

Example 14–5 SPARC: Booting a System With the Kernel Debugger (kmdb)


ok boot kmdb
Resetting...

Executing last command: boot kmdb -d
Boot device: /pci@1f,0/ide@d/disk@0,0:a File and args: kmdb -d
Loading kmdb...

Troubleshooting Booting on the x86 Platform (Task Map)

Task 

Description 

For Instructions 

Stop a system for recovery purposes.  

If a damaged file is preventing the system from booting normally, first stop the system to attempt recovery. 

x86: How to Stop a System for Recovery Purposes

Force a crash dump and reboot of the system. 

You can force a crash dump and reboot of the system as a troubleshooting measure. 

x86: How to Force a Crash Dump and Reboot of the System

Boot a system with the kernel debugger. 

You can boot the system with the kernel debugger to troubleshoot booting problems. Add the -k option to the kernel$ line in the GRUB menu to load kmdb at boot.

x86: How to Boot a System With the Kernel Debugger in the GRUB Boot Environment (kmdb)

x86: How to Stop a System for Recovery Purposes

  1. Stop the system by using one of the following commands, if possible:

    • If the keyboard and mouse are functional, become superuser. Then, type init 0 to stop the system. After the Press any key to reboot prompt appears, press any key to reboot the system.

    • If the keyboard and mouse are functional, become superuser. Then, type init 6 to reboot the system.

  2. If the system does not respond to any input from the mouse or the keyboard, press the Reset key, if it exists, to reboot the system.

    Or, you can use the power switch to reboot the system.

x86: Forcing a Crash Dump and Reboot of the System

Forcing a crash dump and reboot of the system is sometimes necessary for troubleshooting purposes. The savecore feature is enabled by default.

For more information about system crash dumps, see Chapter 17, Managing System Crash Information (Tasks), in System Administration Guide: Advanced Administration.

x86: How to Force a Crash Dump and Reboot of the System

If you cannot use the reboot -d or the halt -d command, you can use the kernel debugger, kmdb, to force a crash dump. The kernel debugger must have been loaded, either at boot or with the mdb -K command, for the following procedure to work.


Note –

You must be in text mode to access the kernel debugger (kmdb), so first exit any window system.


  1. Access the kernel debugger.

    The method used to access the debugger is dependent upon the type of console that you are using to access the system.

    • If you are using a locally attached keyboard, press F1-A.

    • If you are using a serial console, send a break by using the method appropriate to that type of serial console.

    The kmdb prompt is displayed.

  2. To induce a crash, use the systemdump macro.


    [0]> $<systemdump
    

    Panic messages are displayed, the crash dump is saved, and the system reboots.

  3. Verify that the system has rebooted by logging in at the console login prompt.


Example 14–6 x86: Forcing a Crash Dump and Reboot of the System by Using halt -d

This example shows how to force a crash dump and reboot of the x86 based system neptune by using the halt -d and boot commands. Use this method to force a crash dump of the system. Reboot the system afterwards manually.


# halt -d
May 30 15:35:15 wacked.Central.Sun.COM halt: halted by user

panic[cpu0]/thread=ffffffff83246ec0: forced crash dump initiated at user request

fffffe80006bbd60 genunix:kadmin+4c1 ()
fffffe80006bbec0 genunix:uadmin+93 ()
fffffe80006bbf10 unix:sys_syscall32+101 ()

syncing file systems... done
dumping to /dev/dsk/c1t0d0s1, offset 107675648, content: kernel
NOTICE: adpu320: bus reset
100% done: 38438 pages dumped, compression ratio 4.29, dump succeeded

Welcome to kmdb
Loaded modules: [ audiosup crypto ufs unix krtld s1394 sppp nca uhci lofs 
genunix ip usba specfs nfs md random sctp ]
[0]> 
kmdb: Do you really want to reboot? (y/n) y

x86: How to Boot a System With the Kernel Debugger in the GRUB Boot Environment (kmdb)

This procedure shows the basics for loading the kernel debugger (kmdb). The savecore feature is enabled by default. For more detailed information about using the kernel debugger, see the Oracle Solaris Modular Debugger Guide.

  1. Boot the system.

    The GRUB menu is displayed when the system is booted.

  2. When the GRUB menu is displayed, type e to access the GRUB edit menu.

  3. Use the arrow keys to select the kernel$ line.

    If you cannot use the arrow keys, use the ^ key to scroll up and the v key to scroll down.

  4. Type e to edit the line.

    The boot entry menu is displayed. In this menu, you can modify boot behavior by adding additional boot arguments to the end of the kernel$ line.

  5. Type -k at the end of the line.

  6. Press Enter to return to the GRUB main menu.

  7. Type b to boot the system with the kernel debugger enabled.

  8. Access the kernel debugger.

    The method used to access the debugger is dependent upon the type of console that you are using to access the system:

    • If you are using a locally attached keyboard, press F1-A.

    • If you are using a serial console, send a break by using the method appropriate to that type of serial console.

    A welcome message is displayed when you access the kernel debugger for the first time.


Example 14–7 x86: Booting a System With the Kernel Debugger (GRUB Multiboot Implementation)

This example shows how to manually boot a 64-bit capable x86 based system with the kernel debugger enabled.


kernel$ /platform/i86pc/multiboot kernel/amd64/unix -k -B $ZFS-BOOTFS

This example shows how to boot a 64-bit capable x86 based system in 32-bit mode with the kernel debugger enabled.


kernel$ /platform/i86pc/multiboot kernel/unix -k -B $ZFS-BOOTFS
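
Whether a kernel$ line already requests the kernel debugger can be checked by looking for -k as a separate argument. The following is a minimal sketch, assuming a POSIX-compatible shell. The function name has_kmdb_flag is hypothetical, and the check does not detect -k when it is combined with other letters in a single option group.

```shell
# Hypothetical helper: succeed if the given kernel$ line contains -k as a
# standalone, space-delimited argument.
has_kmdb_flag() {
    case " $1 " in
        *" -k "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Line copied from Example 14-7; single quotes keep $ZFS-BOOTFS literal.
line='kernel$ /platform/i86pc/multiboot kernel/amd64/unix -k -B $ZFS-BOOTFS'
if has_kmdb_flag "$line"; then
    echo "kmdb requested"
else
    echo "kmdb not requested"
fi
```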