This section contains information on known issues you may encounter when creating or using virtual machines, and explains how to resolve them.
Paravirtualized (PVM) guests may perform their own system clock management, for example, using the NTPD (Network Time Protocol daemon), or the hypervisor may perform system clock management for all guests.
You can set paravirtualized guests to manage their own system clocks by setting the xen.independent_wallclock parameter to 1 in the /etc/sysctl.conf file. For example:

xen.independent_wallclock = 1
If you want the hypervisor to manage paravirtualized guest system clocks, set xen.independent_wallclock to 0. Any attempt to set or modify the time in a guest will then fail.
You can temporarily override the setting through the /proc file system. For example:

echo 1 > /proc/sys/xen/independent_wallclock
This setting does not apply to hardware virtualized guests.
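As a quick sanity check, the following sketch reads the current value if the guest exposes it. The fallback string is just a placeholder for systems that are not Xen paravirtualized guests:

```shell
# Determine who manages the guest clock. This path exists only inside
# a Xen paravirtualized guest; elsewhere we fall back to a placeholder.
wallclock_path=/proc/sys/xen/independent_wallclock
if [ -r "$wallclock_path" ]; then
    mode=$(cat "$wallclock_path")   # 1 = guest-managed, 0 = hypervisor-managed
else
    mode=unavailable
fi
echo "independent_wallclock: $mode"
```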
Additional parameters may be needed in the boot loader (grub.conf) configuration file for certain operating system variants after the guest is installed. Specifically, for optimal clock accuracy, Linux guest boot parameters should be specified to ensure that the pit clock source is used. Adding clock=pit nohpet nopmtimer will, for most guests, result in the selection of pit as the clock source for the guest. Published templates for Oracle VM include these additional parameters.
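As an illustration, a grub.conf kernel line with these parameters appended might look like the following. The title, kernel version, and root device are placeholders, not values from this document:

```
# /boot/grub/grub.conf fragment (illustrative; your kernel and root differ)
title Oracle Linux Guest
    root (hd0,0)
    kernel /vmlinuz-2.6.18-8.el5 ro root=/dev/VolGroup00/LogVol00 clock=pit nohpet nopmtimer
    initrd /initrd-2.6.18-8.el5.img
```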
Proper maintenance of virtual time can be tricky. The various parameters provide tuning for virtual time management and supplement, but do not replace, the need for an NTP time service running within the guest. Ensure that the ntpd service is running and that the /etc/ntp.conf configuration file points to valid time servers.
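The configuration check can be scripted. The sketch below runs the grep against a throwaway sample ntp.conf rather than the real file, so the file name and server entries here are illustrative only; on a real guest, inspect /etc/ntp.conf and confirm the ntpd service is running:

```shell
# Sketch: verify that an ntp.conf names at least one time server.
# A sample config is used here so the sketch is safe to run anywhere.
ntp_conf=$(mktemp)
cat > "$ntp_conf" <<'EOF'
driftfile /var/lib/ntp/drift
server 0.pool.ntp.org
server 1.pool.ntp.org
EOF
# Count lines that configure a time server.
servers=$(grep -c '^server' "$ntp_conf")
echo "configured time servers: $servers"
rm -f "$ntp_conf"
```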
If your mouse pointer fails to track your cursor in a VNC Viewer session in a hardware virtualized guest, add the following to the Oracle VM Server configuration file located at /etc/xen/xend-config.sxp to force the device model to use absolute (tablet) coordinates:

usbdevice='tablet'
Restart the Oracle VM Server for the changes to take effect. You may need to do this for each Oracle VM Server in the server pool.
When creating a virtual machine from an Oracle VM 2.x template, the clone job fails with the error:

OVMAPI_9039E Cannot place clone VM: template_name.tgz, in Server Pool: server-pool-uuid. That server pool has no servers that can run the VM.
This is caused by a network configuration inconsistency with the vif = ['bridge=xenbr0'] entry in the virtual machine's configuration file.
To resolve this issue, remove any existing networks in the virtual machine template, and replace them with valid networks which have the Virtual Machine role. Start the clone job again and the virtual machine clone is created. Alternatively, remove any existing networks in the template, restart the clone job, and add in any networks after the clone job is complete.
When running hardware virtualized guests, the QEMU process (qemu-dm) may have its memory usage grow substantially, especially under heavy I/O loads. This may cause the hardware virtualized guest to stop as it runs out of memory. If the guest is stopped, increase the memory allocation for dom0, for example from 512 MB to 768 MB. See Section 1.6, “Changing the Memory Size of the Management Domain” for information on changing the dom0 memory allocation.
You cannot migrate virtual machines between computers whose hardware is not identical. To migrate virtual machines, the hardware must be the same make and model, and the CPUs must be in the same CPU family.
Virtual machines can be live migrated between instances of Oracle VM Server that are at the same release or later. For virtual machines running on an x86 platform, a rule exception is generated if you attempt to live migrate a virtual machine to an Oracle VM Server with an earlier release than the Oracle VM Server where the virtual machine is running.
If a virtual machine hosted on a local repository is live migrated and either the source or the target Oracle VM Server becomes unavailable during the migration, Oracle VM Manager attempts to perform a rollback of the operation. This rollback process brings the original version of the virtual machine back online on the source Oracle VM Server and then performs a cleanup operation on the target Oracle VM Server when it becomes available again. This cleanup process involves killing the paused virtual machine that may have been copied to the target Oracle VM Server and then cleaning the target repository of virtual disks, virtual machine configurations, and temporary files. Finally, a repository refresh is performed on the repository on the source server to ensure that everything is in order.
Before the cleanup operation is triggered, an event is created within Oracle VM Manager to indicate that the migration job has failed or been aborted and to track the rollback process. When the event is generated within Oracle VM Manager, it is set with a 'WARNING' status. The rollback process is generated as a set of up to three different jobs that are each given a timeout period of 15 minutes, and which are triggered to attempt to run every 10 seconds. If these jobs succeed, Oracle VM Manager acknowledges the event. If the jobs all timeout, Oracle VM Manager still acknowledges the event, but a second user-acknowledgeable event is created with 'WARNING' status to indicate that the rollback failed. Depending on the cause of the rollback failure, Oracle VM Manager might also create user-acknowledgeable events with 'CRITICAL' status.
Because jobs are usually performed sequentially, it may take a total of 45 minutes before the entire rollback process times out and the new event indicating rollback failure is generated. The rollback failure event is also logged in the log file /u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/AdminServer.log on the Oracle VM Manager host.
The information in the rollback failure event contains the rollback plan that Oracle VM Manager attempted to follow to clean up a failed virtual machine migration. This event can be viewed using the getEventsForObject command in the Oracle VM Manager Command Line Interface, by viewing the events associated with the virtual machine within the Oracle VM Manager Web Interface, or via the Oracle VM Web Services API.
The following content represents the typical output displayed within the description field for a rollback failure event:
Live VM Migration With Storage, started at 2015-11-04 09:51:13,205
VM: [VirtualMachineDbImpl] 0004fb0000060000c71d489702c240b3<978> (MyVM)
Source server: [ServerDbImpl] 30:30:38:37:30:32:58:4d:51:34:35:30:30:37:4c:52<386> (ovs216)
Target server: [ServerDbImpl] 30:30:38:37:30:32:58:4d:51:34:35:30:30:38:58:42<238> (ovs215)
Source repository: [RepositoryDbImpl] 0004fb0000030000c4fca9a963e2706c<479> (r216)
Target repository: [RepositoryDbImpl] 0004fb0000030000f9ba13d5063e330a<382> (r215)
Source vDisks to be migrated:
  /OVS/Repositories/0004fb0000030000c4fca9a963e2706c/VirtualDisks/0004fb000012000005213553a5bba24f.img
Migration job has failed or was aborted.
VM's server has been set back to: (ovs216)
Source vDisk files have been retained.
Constructed the following post-migration completion plan at 2015-11-04 09:51:36,686
VM to be killed on server: (ovs215)
To be deleted on target server (ovs215):
  vDisk: /OVS/Repositories/0004fb0000030000f9ba13d5063e330a/VirtualDisks/0004fb000012000005213553a5bba24f.img
  tmp file: /OVS/Repositories/0004fb0000030000f9ba13d5063e330a/VirtualDisks/tmp_dest.0004fb000012000005213553a5bba24f.img
Also to be deleted on target server (ovs215):
  cfg file: /OVS/Repositories/0004fb0000030000f9ba13d5063e330a/VirtualMachines/0004fb0000060000c71d489702c240b3/vm.cfg
  directory: /OVS/Repositories/0004fb0000030000f9ba13d5063e330a/VirtualMachines/0004fb0000060000c71d489702c240b3
  tmp file: /OVS/Repositories/0004fb0000030000f9ba13d5063e330a/VirtualMachines/tmp_dest.0004fb0000060000c71d489702c240b3
Source repository (r216) must be refreshed.
Note that the description of the event provides detailed information about the migration process and indicates that the migration job has failed. The message explains that the virtual machine is set back to run on the source server and that the source virtual disks have been retained. This means that the virtual machine may either be running or stopped on the source server, but from the perspective of Oracle VM Manager, the location of the virtual machine has been reverted. Most significantly, the output contains a 'post-migration completion plan'. This plan provides a full breakdown of the steps that must be performed to roll the environment back to its original state.
If an event like this appears for a failed migration of a locally hosted virtual machine, you must manually perform the rollback steps on the target server when it next becomes available. It is very important that you ensure that the rollback steps are performed on the systems indicated in the post-migration completion plan. Performing any of these steps on another server could have detrimental effects and could result in virtual machine corruption.
The first step in this plan involves killing the virtual machine on the indicated Oracle VM Server or servers. Depending on the state of the migration at the time that the target Oracle VM Server became unavailable, this may require an action on either the target Oracle VM Server or on both the target and source Oracle VM Servers. In some cases you may not need to perform this action on either Oracle VM Server. The appropriate action is logged in the event description.
During the migration, the virtual machine enters a paused state as it is copied from the source Oracle VM Server to the target Oracle VM Server. Once the copy is complete, the virtual machine on the target Oracle VM Server is not indicated within Oracle VM Manager in any way, as this would conflict with the virtual machine with the identical UUID that is located on the original source Oracle VM Server. This transition is performed within Oracle VM Manager when the migration is complete. As a result, two virtual machines with identical UUIDs may exist within the environment for the duration of the migration. If the target server goes offline at any point during the migration, it is frequently the case that at least one of these virtual machines must be killed off to prevent conflict. Since the representation of the virtual machine within Oracle VM Manager is not reliable until the rollback has been completed, you must perform the kill operation directly on the indicated Oracle VM Server. This is usually done over SSH as the root user, using the following command:
ovs-agent-rpc stop_vm "''" "'0004fb0000060000c71d489702c240b3'" "True"
Note that 0004fb0000060000c71d489702c240b3 should match the UUID of the virtual machine that you were originally migrating. Also pay attention to the quotes in each of the arguments presented here. The first argument for this command is empty, so a pair of single quotes is enclosed in a pair of double quotes. The second argument is the UUID of the virtual machine that you intend to kill, enclosed in a pair of single quotes within a pair of double quotes. Finally, the last argument is used to force the action and contains the text True enclosed in a pair of double quotes.
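Because the nested quoting is easy to get wrong, it can help to assemble the command in a variable first and inspect it before running it. The UUID below is the placeholder value from the example above, and the server name in the comment is illustrative:

```shell
# Build the stop_vm argument string; replace the UUID below with the UUID
# of the virtual machine that was being migrated.
uuid=0004fb0000060000c71d489702c240b3
cmd="ovs-agent-rpc stop_vm \"''\" \"'$uuid'\" \"True\""
# Inspect the assembled command before running it on the indicated server.
echo "$cmd"
# On a real system this would then be run on that server, for example:
#   ssh root@target-server "$cmd"
```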
Note that you should use this command to stop the virtual machine because it helps to identify the correct virtual machine domain to destroy, it maintains the integrity of your environment and logs any actions carried out on the virtual machine. Do not attempt to use Xen hypervisor tools to perform any actions on the virtual machine directly without explicit instruction from an Oracle Support representative.
A live migration of a virtual machine that is hosted on local storage also requires that any virtual disks are copied from the repository hosted on the source server across to the repository on the target server. Therefore, you must manually delete any of these files from the repository hosted on the target Oracle VM Server to clean the environment. To do this, SSH to the target Oracle VM Server and delete the files listed in the plan returned in the event description. For example:
rm -f /OVS/Repositories/0004fb0000030000f9ba13d5063e330a/VirtualDisks/0004fb000012000005213553a5bba24f.img
rm -f /OVS/Repositories/0004fb0000030000f9ba13d5063e330a/VirtualDisks/tmp_dest.0004fb000012000005213553a5bba24f.img
The virtual machine configuration is also copied from the repository hosted on the source server across to the repository on the target server during the migration. Therefore, you must manually delete these files and directories from the repository hosted on the target Oracle VM Server to clean the environment. To do this, SSH to the target Oracle VM Server and delete the files listed in the plan returned in the event description. For example:
rm -f /OVS/Repositories/0004fb0000030000f9ba13d5063e330a/VirtualMachines/0004fb0000060000c71d489702c240b3/vm.cfg
rm -rf /OVS/Repositories/0004fb0000030000f9ba13d5063e330a/VirtualMachines/0004fb0000060000c71d489702c240b3
rm -f /OVS/Repositories/0004fb0000030000f9ba13d5063e330a/VirtualMachines/tmp_dest.0004fb0000060000c71d489702c240b3
During the migration process, Oracle VM Manager updates its model of the source and target repositories hosted on each Oracle VM Server to match the environment as it would be after the migration is complete. It does not revert this representation unless an automated rollback is achieved. If the rollback has failed and you have performed manual steps to revert your environment to its original state, you must also refresh the repository within Oracle VM Manager so that the model accurately reflects the state of the repository. You can either do this using the Oracle VM Manager Web Interface or you can use the Oracle VM Manager Command Line Interface directly. For example:
ssh -l admin localhost -p 10000 refresh repository name="r216"
At this point, your environment should be completely reverted.
On some hardware, such as the Sun Fire X4170 M2 Server, migration of very large virtual machines using hardware virtualization can result in a soft lockup that causes the virtual machine to become unresponsive. The lockup occurs when the migration causes the virtual machine kernel to lose its clock source. The console for the virtual machine shows a series of error messages similar to the following:
BUG: soft lockup - CPU#0 stuck for 315s! [kstop/0:2131]
To resolve this, the virtual machine must be restarted, and the clocksource=jiffies option should be added to the HVM guest kernel command line before rebooting the virtual machine again.
This option should only be used on HVM guest systems that have actually experienced a CPU soft lockup.
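For illustration, a hypothetical HVM guest grub.conf kernel line with the option appended. The kernel version and root device are placeholders, not values from this document:

```
# /boot/grub/grub.conf fragment (illustrative; your kernel and root differ)
kernel /vmlinuz-2.6.18-8.el5 ro root=/dev/VolGroup00/LogVol00 clocksource=jiffies
```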
Some devices, such as sound cards, may not work as expected in hardware virtualized guests. In a hardware virtualized guest, a device that requires physical memory addresses instead uses virtualized memory addresses, so incorrect memory location values may be set. This is because DMA (Direct Memory Access) is virtualized in hardware virtualized guests.
Hardware virtualized guest operating systems expect to be loaded in memory starting somewhere around address 0 and upwards. This is only possible for the first hardware virtualized guest loaded. Oracle VM Server virtualizes the memory address to be 0 to the size of allocated memory, but the guest operating system is actually loaded at another memory location. The difference is fixed up in the shadow page table, but the operating system is unaware of this.
For example, a sound loaded into memory in a hardware virtualized guest running Microsoft Windows™ at an address of 100 MB may produce garbage through the sound card, instead of the intended audio. This is because the sound is actually loaded at 100 MB plus an offset of 256 MB. The sound card receives the address of 100 MB, but the data is actually at 356 MB.
An IOMMU (Input/Output Memory Management Unit) in the computer's memory management unit would remove this problem as it would take care of mapping virtual addresses to physical addresses, and enable hardware virtualized guests direct access to the hardware.
If you opt to create a PVHVM or PVM guest, you must ensure that all disks that the virtual machine is configured to use are configured as paravirtual devices, or they may not be recognized by the virtual machine. If you discover that a disk or virtual CD-ROM device is not being recognized by your virtual machine, you may need to edit the vm.cfg file for the virtual machine directly. To do this, determine the UUID of the virtual machine, and then locate the configuration file in the repository, for example on an Oracle VM Server:

# vi /OVS/Repositories/UUID/vm.cfg
Locate each disk entry that contains a hardware device such as hda, hdb, or hdc and replace it with an xvd mapping, such as xvda, xvdb, xvdc, and so on.
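A sketch of one way to make this substitution with sed. The disk path below is illustrative, and you should back up vm.cfg before editing the real file:

```shell
# Rewrite an emulated IDE device name (hda, hdb, ...) in a vm.cfg disk
# entry to its paravirtual xvd equivalent. The sample line stands in for
# a real vm.cfg entry; apply the same sed expression to the actual file.
line="disk = ['file:/OVS/Repositories/UUID/VirtualDisks/system.img,hda,w']"
fixed=$(echo "$line" | sed 's/,hd\([a-z]\),/,xvd\1,/')
echo "$fixed"
```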
Restart the virtual machine with the new configuration to check that it is able to discover the disk or virtual CD-ROM device.
When creating a virtual machine, the following message may be displayed:
Error: There is no server supporting hardware virtualization in the selected server pool.
To resolve this issue, make sure the Oracle VM Server supports hardware virtualization. Follow these steps to check:
Run the following command to check if hardware virtualization is supported by the CPU:
# cat /proc/cpuinfo | grep -E 'vmx|svm'
If any output containing vmx (Intel) or svm (AMD) is displayed, the CPU supports hardware virtualization. Here is an example of the returned message:

flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
Note: The /proc/cpuinfo file only shows virtualization capabilities starting with Linux 2.6.15 (Intel®) and Linux 2.6.16 (AMD). Use the uname -r command to query your kernel version.
Make sure you have enabled hardware virtualization in the BIOS.
Run the following command to check if the operating system supports hardware virtualization:
# xm info |grep hvm
The following is an example of the returned message:
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x
If the CPU does not support hardware virtualization, use the paravirtualized method to create the virtual machine. See the Servers and VMs Tab section in the Oracle VM Manager Online Help for information on creating a paravirtualized virtual machine.
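The CPU check above can be scripted. The sketch below runs the test against a sample flags string rather than the real /proc/cpuinfo, so the flag list shown is illustrative; on a real server, substitute the file itself:

```shell
# Sketch of the CPU capability check. On a real Oracle VM Server you would
# read /proc/cpuinfo (and then run "xm info | grep hvm"); a sample flags
# string is used here so the logic itself is visible and testable.
flags="fpu tsc msr pae vmx est tm2 cx16"
if echo "$flags" | grep -Eq 'vmx|svm'; then
    hvm_cpu=yes    # vmx = Intel VT-x, svm = AMD-V
else
    hvm_cpu=no
fi
echo "CPU HVM support: $hvm_cpu"
```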
To change the CD in a virtual machine:
Unmount the first CD:
# umount mount-point
Select the second ISO file, and click Change CD.
Mount the second CD:
# mount /dev/cdrom mount-point
The Xen hypervisor makes it possible to generate a core dump file for a virtual machine in the case that it crashes. This file can be useful for debugging and support purposes. Core dump files can be large, and to avoid overwriting files, each file is named uniquely. When this facility is enabled, core dump files are saved to /var/xen/dump on the Oracle VM Server where the virtual machine was running when it crashed. This can rapidly use up available disk space on the dom0 system partition. If you enable this facility, you must ensure that enough disk space is available at this path on the Oracle VM Server, either by mounting an additional disk at this path, or by creating a symbolic link for this path that points to an alternate location with plenty of available disk space.
By default, this facility is disabled at a system-wide level. It is possible to change this behavior by editing /etc/xen/xend-config.sxp directly and changing the lines:

# Whether to enable core-dumps when domains crash.
#(enable-dump no)

to:

# Whether to enable core-dumps when domains crash.
(enable-dump yes)
After making this change, you must reboot the Oracle VM Server for the change to take effect. Manually editing the global Xen configuration parameters on an Oracle VM Server is not supported by Oracle.
It is possible to override the system-wide behavior by setting this parameter directly in the vm.cfg file for each individual virtual machine. This is the preferred approach to generating dump files, as it allows you to limit core dumps to only those virtual machines that you are interested in debugging. This configuration option can be controlled for each virtual machine from within Oracle VM Manager, by configuring the Restart Action On Crash option for the virtual machine. See the Servers and VMs Tab section in the Oracle VM Manager Online Help for more information on this parameter.
If you change the Restart Action On Crash option for a virtual machine, you must stop the virtual machine and then start it again before the change takes effect. This is different from restarting the virtual machine, as the vm.cfg configuration file for the virtual machine is only read by the Xen hypervisor when the virtual machine is started. If you have made the configuration change but have not properly restarted the virtual machine, a crash and reboot does not automatically cause the configuration option to take effect.
To test whether or not the core dump facility is working properly for a virtual machine, you may be able to directly trigger a crash by logging into the virtual machine and obtaining root privileges before issuing the following command:
# echo c >/proc/sysrq-trigger
This command assumes that the operating system on the virtual machine is Linux-based and that the System Request trigger is enabled within the kernel. After you have triggered the crash, check /var/xen/dump on the Oracle VM Server where the virtual machine was running to view the dump file.
When a virtual machine is hosted in a repository using local storage on the Oracle VM Server where it is running, migration of that virtual machine to another Oracle VM Server and repository requires that I/O on the affected disks is not excessively high. If you are running an application that has high I/O during a migration, it may cause the guest or the application to hang. On guests that are running a Linux operating system, you can mitigate this by tuning virtual memory caching parameters within the guest kernel and by reducing the ext4 journaling commit frequency on any file systems that are formatted with ext4.
Tuning virtual memory caching. On the guest command line, as the root user, you can tune the cache by using the sysctl command to set a number of kernel parameters. Oracle recommends that you reduce the cache size to 5% of the system memory (the default value is 10) and reduce the time that a memory page can remain dirty until it is flushed to around 20 seconds (the default is 30 seconds). You can do this temporarily by running the following commands:
# sysctl -w vm.dirty_background_ratio=5
# sysctl -w vm.dirty_expire_centisecs=2000
Alternatively, edit /etc/sysctl.conf and add these lines:

vm.dirty_background_ratio=5
vm.dirty_expire_centisecs=2000
When you have done this, you can load these values into the kernel by running sysctl -p.
Tuning ext4 journaling. If the guest is using any file systems that are formatted to use ext4, the journaling commit process may be affected by a migration. To protect against this, decrease the amount of time between journal commits and ensure that commits are performed asynchronously. To do this, tune the mount parameters for any ext4 file system that is mounted within the guest. For example, when mounting an ext4 formatted file system you might use the following options:
# mount -o commit=5,journal_async_commit /dev/xvdd /vdisk3
To perform this effectively for all ext4 mounts, you may need to edit your /etc/fstab file.
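For illustration, an /etc/fstab entry carrying these mount options might look like the following. The device and mount point are the example values used above, not values you should copy verbatim:

```
# /etc/fstab entry (illustrative; device and mount point are placeholders)
/dev/xvdd  /vdisk3  ext4  defaults,commit=5,journal_async_commit  0 0
```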