3 Known Issues
This chapter describes the known issues for the Unbreakable Enterprise Kernel Release 5.
Unusable or Unavailable Features for Arm
This section calls out specific features that are known not to work, remain untested, or have known issues that make them unusable.
- InfiniBand
  InfiniBand hardware is currently not supported for Arm architecture using UEK R5.
- FibreChannel
  FibreChannel hardware is currently not supported for Arm architecture using UEK R5.
- RDMA
  RDMA and any sub-features are not supported for Arm.
- OCFS2
  The OCFS2 file system and all of the features described in OCFS2 are not supported for Arm.
- Secure Boot
  The Secure Boot feature is currently not supported or available for Arm.
[aarch64] IOMMU issues
Performance issues, such as increased boot times, soft lockups, and crashes, can occur on 64-bit Arm (aarch64) systems running UEK R5 when the input/output memory management unit (IOMMU) feature is active. These issues have been observed on some Arm hardware using Mellanox CX-3 and CX-4 cards. However, note that similar issues could occur with different drivers on different hardware.
UEK R5 is configured to use swiotlb by default. To enable the use of the IOMMU feature, use iommu.passthrough=0 on the kernel command line. (Bug IDs 27687153, 27759954, 27812727, and 27862655)
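For example, one way to add this option for all installed kernels (a sketch assuming a GRUB-based Oracle Linux system with the grubby utility available) is the following, after which a reboot is required:
# grubby --update-kernel=ALL --args="iommu.passthrough=0"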
[aarch64] Kdump issues
Several issues are noted when using Kdump on 64-bit Arm (aarch64) architecture platforms.
- Kdump fails when using Mellanox ConnectX devices
  On systems with Mellanox hardware devices that use either the mlx4_core or the mlx5_core driver module, Kexec fails to load the crash kernel and hangs while the mlx4_core or mlx5_core driver is initialized. The workaround is to disable loading of the driver in the crash kernel by adding either rd.driver.blacklist=mlx4_core or rd.driver.blacklist=mlx5_core to the KDUMP_COMMANDLINE_APPEND option in /etc/sysconfig/kdump, as shown in the example following this list. Note that this workaround is only possible if Kdump is configured to store the vmcore file locally. If Kdump is configured to save the vmcore to a remote host via the affected device, this workaround fails. (Bug IDs 27915989 and 27916214)
- Kdump fails when configured to use a remote dump target over an ixgbe device
  On systems where Kdump is configured to use a remote dump target over an ixgbe network device, the Kdump Vmcore Save Service is unable to save the vmcore file to the target destination. (Bug ID 27915827)
- Kdump fails and hangs when configured to use a remote dump target over an igb device
  On systems where Kdump is configured to use a remote dump target over an igb network device, NETDEV WATCHDOG returns a timeout error and the network adapter is continually reset, resulting in a system hang when kexec attempts to load the crash kernel. (Bug ID 27916095)
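As a sketch of the mlx4_core variant of the workaround described above, the edited line in /etc/sysconfig/kdump might look as follows (the other options shown are illustrative defaults and not part of the fix):
KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices rd.driver.blacklist=mlx4_core"
After saving the change, restart the kdump service so that the crash kernel is reloaded with the new options:
# systemctl restart kdump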
[aarch64] CPU hotplug not functional in KVM
Although CPU hotplug functionality is available in QEMU, the aarch64 Linux kernel is not yet able to handle the addition of new virtual CPUs to a running virtual machine. When QEMU is used to add a virtual CPU to a running virtual machine in KVM, an error is returned:
kvm_init_vcpu failed: Device or resource busy
CPU hotplug functionality is currently unavailable for UEK R5 on 64-bit Arm platforms. (Bug ID 28140386)
File System Issues
The following are known issues that are specific to file systems supported with Unbreakable Enterprise Kernel Release 5 Update 2.
ext4: Frequent repeated system shutdowns can cause file system corruption
If a system using ext4 is repeatedly and frequently shut down, the file system may become corrupted. This issue is considered a corner case due to the difficulty of replicating it. The issue exists in upstream code, and proposed patches are currently under review. (Bug ID 27547113)
xfs: xfs_repair fails to repair the corrupted link counts
If an xfs file system is repaired by using the xfs_repair command and there are invalid inode counts, the utility may fail to repair the corrupted link counts and return errors while verifying the link counts. The issue is currently under investigation, but appears to be related to the xfsprogs-4.15-1 package released with UEK R5. The issue may not appear when using the earlier xfsprogs-4.5.0-18.0.1 version of this package, available in the ol7_latest yum repository. (Bug ID 28070680)
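If you need to confirm whether the earlier package avoids the problem, one possible approach (assuming the xfsprogs-4.5.0-18.0.1 version is still published in the enabled ol7_latest repository) is to downgrade the package:
# yum downgrade xfsprogs-4.5.0-18.0.1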
RDMA Issues
The following issues are noted for RDMA:
- ibacm service is disabled by default
  The ibacm service is disabled by default immediately after installation. This means that the ibacm service does not automatically start after a reboot. This is intended behavior. Requirements to use the ibacm service are application-specific. If your application requires this service, you may need to enable it to start after a reboot, as shown in the example following this list:
  # systemctl enable ibacm
  (Bug ID 28074471)
- Error, some other host already uses address xxx.xxx.xxx.xxx
  The following error message might be triggered in certain instances:
  Error, some other host already uses address xxx.xxx.xxx.xxx
  This issue is typically triggered when active-bonding is enabled and you run the ifup ib-interface command. You can ignore this message, as the InfiniBand interface is brought up successfully. (Bug ID 28097516)
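To start the ibacm service immediately in the current session and verify that it is running, the standard systemctl commands apply:
# systemctl start ibacm
# systemctl status ibacm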
Docker Issues
The following are known Docker issues:
- Running yum install within a container on an overlayfs file system can fail with the following error:
  Rpmdb checksum is invalid: dCDPT(pkg checksums): package_name
  This error can break Dockerfile builds, but is expected behavior from the kernel and is a known issue upstream (see https://github.com/docker/docker/issues/10180).
  The workaround is to run touch /var/lib/rpm/* before installing the package.
  Note that this issue is fixed in any Oracle Linux images available on the Docker Hub or Oracle Container Registry, but the issue could still be encountered when running any container based on a third-party image. (Bug ID 21804564)
- Docker can fail where it uses the overlay2 storage driver on XFS-formatted storage
  A kernel patch has been applied to prevent overlay mounts on XFS if the ftype option is not set to 1. This fix resolves an issue where XFS did not properly support the whiteout features of an overlay file system if d_type support was not enabled. If the Docker Engine is already using XFS-formatted storage with the overlay2 storage driver, an upgrade of the kernel can cause Docker to fail if the underlying XFS file system was not created with the -n ftype=1 option enabled. The root partition on Oracle Linux 7 is automatically formatted with -n ftype=0 where XFS is selected as the file system. Therefore, if you intend to use the overlay2 storage driver in this environment, you must format a separate device for this purpose, as shown in the example following this list. (Bug ID 25995797)
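As a sketch of preparing a separate device for the overlay2 storage driver (the device name /dev/sdb1 and the mount point are hypothetical placeholders), create the file system with ftype enabled and verify the setting once mounted:
# mkfs.xfs -n ftype=1 /dev/sdb1
# mount /dev/sdb1 /var/lib/docker
# xfs_info /var/lib/docker | grep ftype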
IOMMU kernel option enabled by default
Starting with UEK R5U1, IOMMU functionality is enabled by default in the x86_64 kernel. This change better facilitates single root input/output virtualization (SR-IOV) and other virtualization extensions, but it is also known to result in boot failures on certain hardware that cannot complete discovery when IOMMU is enabled. The status of this feature no longer appears in /proc/cmdline as iommu=on, and the feature may need to be explicitly disabled as a kernel command-line option if a boot failure occurs, as shown in the example below. As an alternative workaround, you can disable IOMMU or Intel VT-d in your system ROM by following your vendor's instructions.
These boot failure issues have been observed on equipment with certain Broadcom network devices, such as HP Gen8 servers. For more detailed information, see https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c04565693.
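For example, on Intel-based x86_64 systems the feature can usually be disabled by adding intel_iommu=off to the kernel command line; assuming the grubby utility is available, this can be applied to all installed kernels as follows:
# grubby --update-kernel=ALL --args="intel_iommu=off"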
LXC Issues
The following are known LXC issues:
- LXC read-only ip_local_port_range parameter
  With lxc-1.1 or later and UEK R5, ip_local_port_range is a read-writable parameter under /proc/sys/net/ipv4 in an Oracle Linux container, rather than being read-only, as illustrated in the example following this list. (Bug ID 21880467)
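A brief illustration of this behavior from inside an Oracle Linux container (the port range values shown are examples only):
# cat /proc/sys/net/ipv4/ip_local_port_range
32768	60999
# echo "32768 61000" > /proc/sys/net/ipv4/ip_local_port_range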
NVMe device names change across reboots
Since UEK R5 adds support for NVMe subsystems and multipathing, enumerated device names generated by the kernel are not stable. This is similar to the way that other block devices are handled by the kernel. If you use enumerated kernel instance names to handle mounts in your fstab, the mounts may fail or behave unpredictably.
Never use enumerated kernel instance names when referring to block devices. Instead, use the UUID, partition label or file system label to refer to any block device, including an NVMe device. If you are uncertain of the device UUID or labels, use the blkid command to view this information.
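For example, to find the UUID of an NVMe namespace and then reference it in /etc/fstab (the UUID, file system type, and mount point shown are illustrative):
# blkid /dev/nvme0n1
/dev/nvme0n1: UUID="0f1a2b3c-4d5e-6f70-8192-a3b4c5d6e7f8" TYPE="xfs"
UUID=0f1a2b3c-4d5e-6f70-8192-a3b4c5d6e7f8  /data  xfs  defaults  0 0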
Prior to multipathing, a subsystem number would typically map onto the controller number. Therefore, you could assume that the subsystem at /dev/nvme0n1 was affiliated with the controller /dev/nvme0. This is no longer the case. For multipathing to be enabled, a subsystem can have multiple controllers. In this case, /dev/nvme0n1 could just as easily be affiliated with controllers at /dev/nvme1 and /dev/nvme2. There is no specific correlation between the subsystem device name and the controller device name.
NVMe device hotplug unplug procedure change
Since UEK R5 adds support for NVMe subsystems and multipathing, enumerated device names generated by the kernel are not stable. This means that the procedure for identifying and unplugging NVMe devices using hotplug functionality is slightly different from the procedure that you may have followed with other kernel releases. This note describes the steps that you should take to identify, power off, and unplug the appropriate device.
- Use the lsblk command to identify the disk that you wish to remove according to its WWN or UUID. For example:
  # lsblk -o +UUID,WWN,MODEL
  Take note of the enumerated kernel instance name that has been assigned to the device, for example, nvme0n1. It is very important to understand that the device name does not necessarily map onto the controller or PCIe bridge that it is attached to. See NVMe device names change across reboots for more information.
- Search for the device path to obtain the PCI domain identifier for the device:
  # find /sys/devices -iname nvme0n1
  /sys/devices/pci0000:85/0000:85:01.0/0000:8d:00.0/nvme/nvme1/nvme0n1
  Note that 0000:8d:00.0 in the returned path is the PCI domain identifier for the device. You need this information to proceed.
- Obtain the physical slot number for the NVMe drive. Under UEK R5, the slot is bound to the NVMe device directly and not to the PCIe controller. You can find the slot number for the NVMe device by running the lspci command and querying the PCI domain identifier for the device in verbose mode:
  # lspci -s 0000:8d:00.0 -vvv
  8d:00.0 Non-Volatile memory controller: Intel Corporation Express Flash NVMe P4500 (prog-if 02 [NVM Express])
          Subsystem: Oracle/SUN Device 4871
          Physical Slot: 104-1
  …
  Note that the Physical Slot number for the device in this example is 104-1. Take note of this value to proceed.
- Use the Physical Slot number for the device to find its bus interface:
  # find /sys -iname "104-1"
  /sys/bus/pci/slots/104-1
- Use the returned bus interface path to power off the NVMe drive:
  # echo 0 > /sys/bus/pci/slots/104-1/power
  Depending on your hardware, a blue disk LED on the front panel of the system may light to indicate that you can safely remove the disk drive.
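If the drive is replaced, the same slot interface can typically be used to power the new device back on (this sketch assumes the same Physical Slot value of 104-1):
# echo 1 > /sys/bus/pci/slots/104-1/power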
KVM guest crashes when using memory hotplug operation to shrink available memory
A KVM guest may crash if the guest memory is reduced from 96 GB or more down to 2 GB by using a memory hotplug operation. Although this issue is logged for UEK R5, similar issues have been noted for RHCK. The issue is expected behavior and relates to how memory ballooning works. Shrinking guest memory by a large amount can result in out-of-memory (OOM) conditions, where processes are killed automatically if memory shrinks below the amount in use by the guest operating system at the time. (Bug ID 27968656)
Kernel warning when allocating memory for Avago MegaRAID SAS 9460-16i controller
An issue that causes a kernel warning when loading the megaraid_sas module for the Avago MegaRAID SAS 9460-16i controller is introduced in this kernel release. The issue occurs when the kernel attempts to allocate memory for the IO request frame pool.
The issue is resolved by setting the contiguous memory allocation (cma) value to 64M at boot, by editing the /etc/default/grub file to update the GRUB_CMDLINE_LINUX line to include the option cma=64M. For example:
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=ol7/root rd.lvm.lv=ol7/swap rhgb quiet cma=64M"
(Bug ID 29635963, 29618702)
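After editing /etc/default/grub, regenerate the GRUB configuration so that the new option takes effect at the next boot. On a BIOS-based Oracle Linux 7 system, this is typically:
# grub2-mkconfig -o /boot/grub2/grub.cfg
On UEFI-based systems, the output file is typically /boot/efi/EFI/redhat/grub.cfg instead.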