6.19.2.1.3 Recover the KVM Host on Exadata X8M-2

This procedure describes how to recover the KVM host on an Oracle Exadata X8M-2 database server.

  1. Boot the server in diagnostic mode.
    See Booting a Server using the Diagnostic ISO File in Oracle Exadata System Software User's Guide.
  2. Log in to the diagnostics shell as the root user.
    When prompted, enter the diagnostics shell.

    For example:

    Choose from following by typing letter in '()':
    (e)nter interactive diagnostics shell. Must use credentials 
    from Oracle support to login (reboot or power cycle to exit
    the shell),
    (r)estore system from NFS backup archive, 
    Type e to enter the diagnostics shell, and then log in as the root user. If you are prompted for the root user password and do not have it, contact Oracle Support Services.
  3. If required, use /opt/MegaRAID/storcli/storcli64 to configure the disk controller to set up the disks.
  4. Remove the logical volumes, the volume group, and the physical volume, in case they still exist after the disaster.
    # lvm vgremove VGExaDb --force
    # lvm pvremove /dev/sda3 --force
  5. Remove the existing partitions, and then verify that all partitions were removed. You can use the following script:
    # for v_partition in $(parted -s /dev/sda print|awk '/^ / {print $1}')
    do
      parted -s /dev/sda rm ${v_partition}
    done
     
    # parted  -s /dev/sda unit s print
    Model: AVAGO MR9361-16i (scsi)
    Disk /dev/sda: 3509760000s
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Disk Flags:  
    
    Number  Start   End  Size  File system Name  Flags
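One way to confirm the result without reading the listing by eye is to count the partition rows that remain. The following is a minimal sketch and not part of the documented procedure. The function name count_partitions is hypothetical, and the sketch assumes that partition rows are the only lines in the parted listing that begin with optional whitespace followed by a number.

```shell
# Hypothetical helper: count the partition rows in the output of
# `parted -s <disk> unit s print`, supplied on standard input.
count_partitions() {
  # grep -c prints 0 and returns a non-zero status when nothing matches,
  # so the status is ignored here.
  grep -c '^[[:space:]]*[0-9][0-9]*[[:space:]]' || true
}
```

For example, `parted -s /dev/sda unit s print | count_partitions` should print 0 after all partitions have been removed.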
  6. Create three partitions on /dev/sda
    1. Get the end sector for the disk /dev/sda from a running KVM host and store it in a variable:
      # end_sector_logical=$(parted -s /dev/sda unit s print | perl -ne '/^Disk\s+\S+:\s+(\d+)s/ and print $1')
      # end_sector=$( expr $end_sector_logical - 34 )
      # echo $end_sector
      The start and end sector values in the commands below were taken from an existing KVM host. Because these values can change over time, check them on a running KVM host when you perform this procedure. For example, on an Oracle Exadata Database Machine X8M-2 database server, you might see the following:
      # parted -s /dev/sda  unit s print
      Model: AVAGO MR9361-16i (scsi)
      Disk /dev/sda: 7025387520s
      Sector size (logical/physical): 512B/512B
      Partition Table: gpt
      Disk Flags:
      Number   Start     End         Size         File system   Name     Flags  
      1        64s       1048639s    1048576s     xfs           primary  boot  
      2        1048640s  1572927s    524288s      fat32         primary  boot  
      3        1572928s  7025387486s 7023814559s                primary  lvm
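The end-sector calculation can also be written without perl or expr. The following is a minimal sketch under the same assumption as the commands above, namely that the end sector for /dev/sda3 is the disk size in sectors minus 34. The function name last_usable_sector is hypothetical.

```shell
# Hypothetical helper: read the output of `parted -s <disk> unit s print` on
# standard input and print the disk size in sectors minus 34.
last_usable_sector() (
  # Extract the sector count from the "Disk /dev/...: NNNs" line.
  total=$(sed -n 's|^Disk[[:space:]]*/[^:]*:[[:space:]]*\([0-9][0-9]*\)s.*$|\1|p' | head -n 1)
  # Fail if no disk size line was found.
  [ -n "$total" ] || exit 1
  echo $(( total - 34 ))
)
```

With the example listing above, where the disk size is 7025387520s, the function prints 7025387486, which matches the end of partition 3 in that listing. A possible use is `end_sector=$(parted -s /dev/sda unit s print | last_usable_sector)`.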
    2. Create the boot partition, /dev/sda1.
      # parted -s /dev/sda  mklabel gpt mkpart primary 64s 1048639s set 1 boot on
    3. Create the EFI boot partition, /dev/sda2.
      # parted -s /dev/sda  mkpart primary fat32 1048640s 1572927s set 2 boot on
    4. Create the partition that will hold the logical volumes, /dev/sda3.
      # parted -s /dev/sda mkpart primary 1572928s ${end_sector}s set 3 lvm on
    5. Verify all the partitions have been created.
      # parted -s /dev/sda unit s print
      Model: AVAGO MR9361-16i (scsi)
      Disk /dev/sda: 3509760000s
      Sector size (logical/physical): 512B/512B
      Partition Table: gpt
      Disk Flags:
      Number  Start    End            Size          File system    Name    Flags 
      1       64s      1048639s       1048576s      xfs            primary boot 
      2       1048640s 1572927s       524288s       fat32          primary boot   
      3       1572928s 3509759966s    3508187039s                  primary lvm
  7. Create logical volumes and file systems.
    1. Create the physical volume and the volume group.
      # lvm pvcreate /dev/sda3
      # lvm vgcreate VGExaDb /dev/sda3
    2. Create and label the logical volume for the file system that will contain the first system partition and build an xfs file system on it.
      # lvm lvcreate -n LVDbSys1 -L15G VGExaDb -y
      # mkfs.xfs -L DBSYS /dev/VGExaDb/LVDbSys1 -f
    3. Create and label the logical volume for the swap space.
      # lvm lvcreate -n LVDbSwap1 -L16G VGExaDb -y
      # mkswap -L SWAP /dev/VGExaDb/LVDbSwap1
    4. Create the logical volume for the second system partition and build an xfs file system on it.
      # lvm lvcreate -n LVDbSys2 -L15G VGExaDb -y
      # mkfs.xfs /dev/VGExaDb/LVDbSys2
    5. Create and label the logical volume for the HOME partition and build an xfs file system on it.
      # lvm lvcreate -n LVDbHome -L4G VGExaDb -y
      # mkfs.xfs -L HOME /dev/VGExaDb/LVDbHome
    6. Create the logical volume for the tmp partition and build an xfs file system on it.
      # lvm lvcreate -n LVDbTmp -L3G VGExaDb -y
      # mkfs.xfs -L TMP /dev/VGExaDb/LVDbTmp -f
    7. Create the logical volume for the first var partition and build an xfs file system on it.
      # lvm lvcreate -n LVDbVar1 -L2G VGExaDb -y
      # mkfs.xfs -L VAR /dev/VGExaDb/LVDbVar1 -f
    8. Create the logical volume for the second var partition and build an xfs file system on it.
      # lvm lvcreate -n LVDbVar2 -L2G VGExaDb -y
      # mkfs.xfs /dev/VGExaDb/LVDbVar2 -f
    9. Create and label the logical volume for the LVDbVarLog partition and build an xfs file system on it.
      # lvm lvcreate -n LVDbVarLog -L18G VGExaDb -y
      # mkfs.xfs -L DIAG /dev/VGExaDb/LVDbVarLog -f
    10. Create and label the logical volume for the LVDbVarLogAudit partition and build an xfs file system on it.
      # lvm lvcreate -n LVDbVarLogAudit -L1G VGExaDb -y
      # mkfs.xfs -L AUDIT /dev/VGExaDb/LVDbVarLogAudit -f
    11. Create the LVDoNotRemoveOrUse logical volume.
      # lvm lvcreate -n LVDoNotRemoveOrUse -L2G VGExaDb -y
    12. Create the logical volume for the guest storage repository and build an xfs file system on it.
      # lvm lvcreate -n LVDbExaVMImages -L1500G VGExaDb -y
      # mkfs.xfs -m crc=1 -m reflink=1 -L EXAVMIMAGES /dev/VGExaDb/LVDbExaVMImages -f
    13. Create a file system on the /dev/sda1 partition, and label it.
      # mkfs.xfs -L BOOT /dev/sda1 -f
    14. Create a file system on the /dev/sda2 partition, and label it.
      # mkfs.vfat -v -c -F 32 -s 2 /dev/sda2
      # dosfslabel /dev/sda2 ESP
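Before creating the logical volumes, it can help to add up the requested sizes and compare the total with the space available in the volume group. The following is a minimal sketch and not part of the documented procedure. The names lv_table and total_lv_gib are hypothetical, and the sizes are copied from the -L values in the lvcreate commands in this step.

```shell
# Logical volume names and sizes (the numeric part of each -L<size>G value),
# copied from the lvcreate commands in this step.
lv_table() {
  cat <<'EOF'
LVDbSys1 15
LVDbSwap1 16
LVDbSys2 15
LVDbHome 4
LVDbTmp 3
LVDbVar1 2
LVDbVar2 2
LVDbVarLog 18
LVDbVarLogAudit 1
LVDoNotRemoveOrUse 2
LVDbExaVMImages 1500
EOF
}

# Hypothetical helper: sum the size column of a "name size" table read from
# standard input.
total_lv_gib() (
  total=0
  while read -r _name size; do
    total=$(( total + size ))
  done
  echo "$total"
)
```

Running `lv_table | total_lv_gib` prints 1578. You can compare that figure with the volume group size reported by `lvm vgs VGExaDb`.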
  8. Create mount points for all the partitions and mount the respective partitions.
    For example, if /mnt is used as the top-level directory, the mounted list of partitions might look like:
    /dev/VGExaDb/LVDbSys1 on /mnt
    /dev/sda1 on /mnt/boot
    /dev/sda2 on /mnt/boot/efi
    /dev/VGExaDb/LVDbHome on /mnt/home
    /dev/VGExaDb/LVDbTmp on /mnt/tmp
    /dev/VGExaDb/LVDbVar1 on /mnt/var
    /dev/VGExaDb/LVDbVarLog on /mnt/var/log
    /dev/VGExaDb/LVDbVarLogAudit on /mnt/var/log/audit
    /dev/VGExaDb/LVDbExaVMImages on /mnt/EXAVMIMAGES
    The following example mounts the system partition, and then creates each remaining mount point and mounts the corresponding partition.
    # mount /dev/VGExaDb/LVDbSys1 /mnt -t xfs
    # mkdir /mnt/boot
    # mount /dev/sda1 /mnt/boot -t xfs
    # mkdir /mnt/boot/efi
    # mount /dev/sda2 /mnt/boot/efi -t vfat
    # mkdir /mnt/home
    # mount /dev/VGExaDb/LVDbHome /mnt/home -t xfs
    # mkdir /mnt/tmp
    # mount /dev/VGExaDb/LVDbTmp /mnt/tmp -t xfs
    # mkdir /mnt/var
    # mount /dev/VGExaDb/LVDbVar1 /mnt/var -t xfs
    # mkdir /mnt/var/log
    # mount /dev/VGExaDb/LVDbVarLog /mnt/var/log -t xfs
    # mkdir /mnt/var/log/audit
    # mount /dev/VGExaDb/LVDbVarLogAudit /mnt/var/log/audit -t xfs
    # mkdir /mnt/EXAVMIMAGES
    # mount /dev/VGExaDb/LVDbExaVMImages /mnt/EXAVMIMAGES -t xfs
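The mount sequence above can also be generated from a table, which makes the required order (each parent directory mounted before its children) easier to review. The following is a minimal sketch that only prints the commands and does not run them. The function name print_mount_commands is hypothetical, and the table is copied from the list of mounted partitions in this step.

```shell
# Hypothetical helper: print the mkdir and mount commands for a given
# top-level directory, in an order that mounts parents before children.
# Nothing is executed; review the output before running it.
print_mount_commands() (
  top=$1
  while read -r dev rel fstype; do
    if [ "$rel" = "/" ]; then
      # The system partition is mounted directly on the top-level directory.
      target=$top
    else
      target=$top$rel
      echo "mkdir $target"
    fi
    echo "mount $dev $target -t $fstype"
  done <<'EOF'
/dev/VGExaDb/LVDbSys1 / xfs
/dev/sda1 /boot xfs
/dev/sda2 /boot/efi vfat
/dev/VGExaDb/LVDbHome /home xfs
/dev/VGExaDb/LVDbTmp /tmp xfs
/dev/VGExaDb/LVDbVar1 /var xfs
/dev/VGExaDb/LVDbVarLog /var/log xfs
/dev/VGExaDb/LVDbVarLogAudit /var/log/audit xfs
/dev/VGExaDb/LVDbExaVMImages /EXAVMIMAGES xfs
EOF
)
```

Running `print_mount_commands /mnt` prints the same 17 commands as the example above, so the output can be checked against it line by line.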
  9. Bring up the network on eth0 and assign the host's IP address and netmask to it. If the host normally obtains its address through DHCP, configure the IP address manually for this procedure.
    # ip link set eth0 up
    # ip address add ip_address_for_eth0/netmask_for_eth0 dev eth0
    # ip route add default via gateway_ip_address dev eth0
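If you have the netmask in dotted-decimal form and prefer to pass a prefix length to ip address add, the conversion can be sketched as follows. This is an illustration only. The function name mask_to_prefix is hypothetical, and the sketch does not check that the set bits in the mask are contiguous.

```shell
# Hypothetical helper: convert a dotted-decimal netmask such as 255.255.252.0
# to a prefix length such as 22. Contiguity of the mask is not validated.
mask_to_prefix() (
  prefix=0
  IFS=.
  # Split the argument on dots into four positional parameters.
  set -- $1
  [ "$#" -eq 4 ] || exit 1
  for octet in "$@"; do
    case $octet in
      255) bits=8 ;; 254) bits=7 ;; 252) bits=6 ;; 248) bits=5 ;;
      240) bits=4 ;; 224) bits=3 ;; 192) bits=2 ;; 128) bits=1 ;;
      0) bits=0 ;;
      *) exit 1 ;;
    esac
    prefix=$(( prefix + bits ))
  done
  echo "$prefix"
)
```

For example, `mask_to_prefix 255.255.252.0` prints 22, so the address would be added as ip_address_for_eth0/22.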
  10. Mount the NFS server holding the backup.
    # mkdir -p /root/mnt
    # mount -t nfs -o ro,intr,soft,proto=tcp,nolock nfs_ip:/location_of_backup  /root/mnt
  11. Restore the files from the backup.

    Assuming that the backup was created using the procedure in Backing up the KVM host Using Snapshot-Based Backup, you can restore the files by using the following command:

    # tar --acls --xattrs --xattrs-include='*' --format=pax -pjxvf /root/mnt/myKVMbackup.tar.bz2 -C /mnt
  12. Create the directory for kdump service.
    # mkdir /mnt/EXAVMIMAGES/crashfiles
  13. Set the boot device using efibootmgr.
    1. Disable and delete the Oracle Linux boot device.
      If an ExadataLinux_1 entry is listed, remove it and recreate it. For example:
      # efibootmgr
      BootCurrent:  0000
      Timeout:  1 seconds
      BootOrder: 0000,0001,000A,000B,0007,0008,0004,0005
      Boot0000*  ExadataLinux_1
      Boot0001*  NET0:PXE IP4 Intel(R) I210 Gigabit Network Connection
      Boot0004*  PCIE1:PXE IP4 Oracle dual 25Gb Ethernet Adapter or dual 10Gb Ethernet Adapter
      Boot0005*  PCIE1:PXE IP4 Oracle dual 25Gb Ethernet Adapter or dual 10Gb Ethernet  Adapter
      Boot0007*  NET1:PXE IP4 Oracle Dual Port 10Gb/25Gb SFP28 Ethernet Controller
      Boot0008*  NET2:PXE IP4 Oracle Dual Port 10Gb/25Gb SFP28 Ethernet Controller
      Boot000A   PCIE2:PXE IP4 Mellanox Network Adapter - 50:6B:4B:CB:EF:F2
      Boot000B   PCIE2:PXE IP4 Mellanox Network Adapter - 50:6B:4B:CB:EF:F3
      MirroredPercentageAbove4G:  0.00
      MirrorMemoryBelow4GB:  false    
      In this example, ExadataLinux_1 (Boot0000) would be disabled and removed. Use the commands below to disable and delete the boot device.

      Disable old ExadataLinux_1:

      # efibootmgr -b 0000 -A
      Delete old ExadataLinux_1:
      # efibootmgr -b 0000 -B
    2. Recreate the boot entry for ExadataLinux_1 and then view the boot order entries.
      # efibootmgr -c -d /dev/sda  -p 2 -l '\EFI\REDHAT\SHIM.EFI' -L 'ExadataLinux_1'
      # efibootmgr
      BootCurrent:  0000
      Timeout:  1 seconds
      BootOrder: 0000,0001,0007,0008,0004,0005,000B,000C
      Boot0000*  ExadataLinux_1
      Boot0001*  NET0:PXE IP4 Intel(R) I210 Gigabit Network Connection
      Boot0004*  PCIE1:PXE IP4 Oracle dual 25Gb Ethernet Adapter or dual 10Gb Ethernet  Adapter
      Boot0005*  PCIE1:PXE IP4 Oracle dual 25Gb Ethernet Adapter or dual 10Gb Ethernet  Adapter
      Boot0007*  NET1:PXE IP4 Oracle Dual Port 10Gb/25Gb SFP28 Ethernet Controller
      Boot0008*  NET2:PXE IP4 Oracle Dual Port 10Gb/25Gb SFP28 Ethernet Controller
      Boot000B*  PCIE2:PXE IP4 Mellanox Network Adapter - EC:0D:9A:CC:1E:46
      Boot000C*  PCIE2:PXE IP4 Mellanox Network Adapter - EC:0D:9A:CC:1E:47
      MirroredPercentageAbove4G: 0.00
      MirrorMemoryBelow4GB: false
    3. In the output from the efibootmgr command, note the boot entry number for ExadataLinux_1 and use that value in the following commands.
      # efibootmgr -b entry_number -A
      # efibootmgr -b entry_number -a
      For example, in the output shown in step 13.2, ExadataLinux_1 is listed as Boot0000, so you would use the following commands:
      # efibootmgr -b 0000 -A
      # efibootmgr -b 0000 -a
    4. Set the correct boot order. Set ExadataLinux_1 as the first boot device.
      The remaining devices should stay in the same boot order.
      # efibootmgr -o 0000,0001,0004,0005,0007,0008,000B,000C
    5. Check the boot order.
      The boot order should now look like the following:
      # efibootmgr
      BootCurrent: 0000
      Timeout: 1 seconds
      BootOrder: 0000,0001,0004,0005,0007,0008,000B,000C
      Boot0000* ExadataLinux_1
      Boot0001* NET0:PXE IP4 Intel(R) I210 Gigabit Network Connection
      Boot0004* PCIE1:PXE IP4 Oracle dual 25Gb Ethernet Adapter or dual 10Gb Ethernet Adapter
      Boot0005* PCIE1:PXE IP4 Oracle dual 25Gb Ethernet Adapter or dual 10Gb Ethernet Adapter
      Boot0007* NET1:PXE IP4 Oracle Dual Port 10Gb/25Gb SFP28 Ethernet Controller
      Boot0008* NET2:PXE IP4 Oracle Dual Port 10Gb/25Gb SFP28 Ethernet Controller
      Boot000B* PCIE2:PXE IP4 Mellanox Network Adapter - EC:0D:9A:CC:1E:46
      Boot000C* PCIE2:PXE IP4 Mellanox Network Adapter - EC:0D:9A:CC:1E:47
      MirroredPercentageAbove4G: 0.00
      MirrorMemoryBelow4GB: false 
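The boot order string can also be derived from the efibootmgr listing instead of being typed by hand. The following is a minimal sketch and not part of the documented procedure. The function name exadata_first_boot_order is hypothetical. It assumes that "the same boot order" means the remaining entries keep their current relative order from the BootOrder line.

```shell
# Hypothetical helper: read `efibootmgr` output on standard input and print a
# comma-separated boot order with the ExadataLinux_1 entry first and every
# other entry kept in its current relative order.
exadata_first_boot_order() {
  awk '
    /^[ \t]*BootOrder:/ {
      order = $0
      sub(/^[ \t]*BootOrder:[ \t]*/, "", order)
    }
    /^[ \t]*Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]\*?[ \t]+ExadataLinux_1([ \t]|$)/ {
      # Turn "Boot0000*" into "0000".
      entry = $1
      sub(/^Boot/, "", entry)
      sub(/\*$/, "", entry)
    }
    END {
      # Fail if no ExadataLinux_1 entry was found.
      if (entry == "") exit 1
      out = entry
      n = split(order, ids, ",")
      for (i = 1; i <= n; i++) {
        gsub(/[ \t]/, "", ids[i])
        if (ids[i] != "" && (ids[i] "") != (entry "")) out = out "," ids[i]
      }
      print out
    }'
}
```

A possible use is `efibootmgr -o "$(efibootmgr | exadata_first_boot_order)"`. Review the generated string before applying it, and then check the result with efibootmgr as shown above.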
    6. Check the boot order using the ubiosconfig command.
      # ubiosconfig export all -x /tmp/ubiosconfig.xml
      Make sure the ExadataLinux_1 entry is the first child element of boot_order.
      # cat /tmp/ubiosconfig.xml
      <boot_order>
       <boot_device>
        <description>ExadataLinux_1</description>
        <instance>1</instance>
       </boot_device>
       <boot_device>
        <description>NET0:PXE IP4 Intel(R) I210 Gigabit Network Connection</description>
        <instance>1</instance>
       </boot_device>
       <boot_device>
        <description>PCIE1:PXE IP4 Oracle dual 25Gb Ethernet Adapter or dual 10Gb Ethernet Adapter</description>
        <instance>1</instance> 
       </boot_device>
       <boot_device>
        <description>PCIE1:PXE IP4 Oracle dual 25Gb Ethernet Adapter or dual 10Gb Ethernet Adapter</description>
        <instance>2</instance> 
       </boot_device>
       <boot_device>
        <description>NET1:PXE IP4 Oracle Dual Port 10Gb/25Gb SFP28 Ethernet Controller</description>
        <instance>1</instance>
       </boot_device>
       <boot_device>
        <description>NET2:PXE IP4 Oracle Dual Port 10Gb/25Gb SFP28 Ethernet Controller</description>
        <instance>1</instance> 
       </boot_device>
       <boot_device>
        <description>PCIE2:PXE IP4 Mellanox Network Adapter - EC:0D:9A:CC:1E:46</description>
        <instance>1</instance>
       </boot_device>
       <boot_device>
        <description>PCIE2:PXE IP4 Mellanox Network Adapter - EC:0D:9A:CC:1E:47</description>
        <instance>1</instance>
       </boot_device>
      </boot_order>
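Rather than reading the whole XML file, you can extract only the first boot device description. The following is a minimal sketch. The function name first_boot_device is hypothetical, and the sketch assumes that the first description element in the export belongs to the first boot_device in boot_order, as in the excerpt above.

```shell
# Hypothetical helper: print the text of the first <description> element
# found in the XML supplied on standard input.
first_boot_device() {
  grep -o '<description>[^<]*</description>' \
    | head -n 1 \
    | sed -e 's/^<description>//' -e 's/<\/description>$//'
}
```

For example, `first_boot_device < /tmp/ubiosconfig.xml` should print ExadataLinux_1 when the boot order is correct.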
    7. Check the restored fstab file and comment out any reference to /EXAVMIMAGES.

      Navigate to /mnt/etc.

      # cd /mnt/etc

      In the /mnt/etc/fstab file, comment out any line that references /EXAVMIMAGES.
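The edit can also be made with sed instead of a text editor. The following is a minimal sketch that works as a filter and leaves lines that are already commented out unchanged. The function name comment_out_exavmimages and the output file name fstab.new are hypothetical.

```shell
# Hypothetical helper: read an fstab on standard input and write it to
# standard output with every active line that references /EXAVMIMAGES
# commented out.
comment_out_exavmimages() {
  # The first expression skips lines that are already comments; the second
  # prefixes the remaining matching lines with "#".
  sed -e '/^[[:space:]]*#/b' -e '/\/EXAVMIMAGES/s/^/#/'
}
```

A possible use is `comment_out_exavmimages < /mnt/etc/fstab > /mnt/etc/fstab.new`, followed by a review of fstab.new before it replaces the original file.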

The KVM host has been recovered.