3 Known Issues

This chapter describes the known issues for the Unbreakable Enterprise Kernel Release 6.

Unusable or Unavailable Arm Features

The following features are known to not work, remain untested, or have issues that cause the feature to be unusable or unavailable on the 64-bit Arm (aarch64) platform:

  • InfiniBand

    InfiniBand hardware is currently not supported on the Arm architecture with UEK R6.

  • Fibre Channel

    Fibre Channel hardware is currently not supported on the Arm architecture with UEK R6.

  • RDMA

    RDMA and any of its subfeatures are not supported for the Arm architecture.

  • Secure Boot and Lockdown

    The Secure Boot feature and the Kernel Lockdown functionality are not supported or available for the Arm architecture.

Serial port console can crash if the serial port baud rate is too low

On systems that use a physical serial console to monitor system output, such as an ILOM console interface, high volumes of output can cause abnormal system behavior, such as kernel deadman timer events that indicate processes are unable to obtain CPU scheduler time. This typically occurs when the serial console speed is set too low and a log level of 6 or higher is configured for the system. To reduce the likelihood of this issue, either lower the log level or configure the console for the maximum possible baud rate, 115200.

Starting with UEK R6U1, a warning is displayed in the dmesg output if the baud rate is set too low:

dmesg | grep -A4 -i baud
[  369.777802] Serial console is set to the default of 9600 baud. This can
[  369.778852] result in stalls or lockups in error conditions requiring a
[  369.779892] large number of console system messages. Please increase the
[  369.780889] rate to the highest your system will allow (for instance, 115200
[  369.781918] or 57600). See Oracle KM Note 2648582.1 for more information.
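As a rough way to see which rate the running kernel was asked to use, the following sketch extracts the baud rate from the last serial console= entry in a kernel command line string. The helper name console_baud is hypothetical, and the fallback to 9600 when no rate is given is an assumption based on the default named in the warning above.

```shell
# console_baud CMDLINE
# Print the baud rate of the last console=ttyS* entry in CMDLINE.
# Falls back to 9600 when the entry carries no rate (assumed default).
# Returns non-zero if CMDLINE has no serial console entry at all.
console_baud() (
    set -f                      # do not glob-expand words of the command line
    baud=
    for arg in $1; do
        case $arg in
            console=ttyS*,*)
                opts=${arg#*,}              # for example "115200n8"
                rate=${opts%%[!0-9]*}       # keep only the leading digits
                baud=${rate:-9600}
                ;;
            console=ttyS*)
                baud=9600
                ;;
        esac
    done
    if [ -n "$baud" ]; then
        echo "$baud"
    else
        return 1
    fi
)

# Sample command line, not taken from a real system.
console_baud "ro quiet console=tty0 console=ttyS0,115200n8"
```

On a live system, the same helper could be fed the running command line, for example console_baud "$(cat /proc/cmdline)".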

The current console speed for a running Oracle Linux 7 or Oracle Linux 8 system can be set for a configured serial port by running:

stty -F /dev/ttyS0 speed 115200

To change the serial console speed that is used when the system boots, you must edit the GRUB configuration. Edit /etc/sysconfig/grub in a text editor and append console=ttyS0,115200 to the line starting with GRUB_CMDLINE_LINUX, for example:

GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/linux1-swap rd.lvm.lv=linux1/root \
  rd.lvm.lv=linux1/swap rhgb quiet console=ttyS0,115200"

Note that the above examples assume that the serial console is ttyS0. You may need to change this value if you use a different serial port.
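The manual edit described above could also be scripted. The following is a minimal sketch, assuming that the GRUB_CMDLINE_LINUX value sits on a single line that ends with a closing double quote (a value continued across lines with a backslash would need to be edited by hand). The function name add_serial_console and the .bak backup file are illustrative choices, not part of any Oracle tooling.

```shell
# add_serial_console FILE [PORT] [BAUD]
# Append console=PORT,BAUD to a single-line GRUB_CMDLINE_LINUX entry in FILE,
# unless an entry for that port is already present. A copy of the original
# file is kept as FILE.bak.
add_serial_console() (
    file=$1 port=${2:-ttyS0} baud=${3:-115200}
    if grep -q "^GRUB_CMDLINE_LINUX=.*console=$port" "$file"; then
        echo "console=$port already present in $file"
    else
        cp "$file" "$file.bak"
        sed "/^GRUB_CMDLINE_LINUX=/ s/\"\$/ console=$port,$baud\"/" \
            "$file.bak" > "$file"
    fi
)

# Demonstration on a scratch copy rather than the real configuration file.
demo=$(mktemp)
printf '%s\n' 'GRUB_TIMEOUT=5' \
    'GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet"' > "$demo"
add_serial_console "$demo"
cat "$demo"
rm -f "$demo" "$demo.bak"
```

Running the function against the real file requires root privileges, and the GRUB configuration must still be regenerated afterward.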

To apply the changes to your GRUB configuration so that they take effect on the next boot, run the following command if you are using legacy BIOS:

sudo grub2-mkconfig -o /boot/grub2/grub.cfg

Alternatively, if you are booting by using the Unified Extensible Firmware Interface (UEFI), run the following command:

sudo grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
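If you are unsure which of the two commands applies, one common heuristic is to check whether the kernel exposes the /sys/firmware/efi directory, which is present only when the system was booted through UEFI. The sketch below uses that check to pick the output path. The function name grub_cfg_path and its optional sysfs root argument are hypothetical and exist only to make the logic easy to try out.

```shell
# grub_cfg_path [SYSFS_ROOT]
# Print the grub.cfg path that matches the firmware type of the running
# system. SYSFS_ROOT defaults to /sys and is overridable for testing.
grub_cfg_path() {
    if [ -d "${1:-/sys}/firmware/efi" ]; then
        echo /boot/efi/EFI/redhat/grub.cfg    # booted through UEFI
    else
        echo /boot/grub2/grub.cfg             # booted through legacy BIOS
    fi
}

# Print the command instead of running it, so that it can be reviewed first.
echo "sudo grub2-mkconfig -o $(grub_cfg_path)"
```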

If you are using Oracle Server hardware, or a system that provides an ILOM interface to the serial console, make sure that you update the serial console configuration on the ILOM to match the speed that you have set within the host operating system. You can set the serial port on the ILOM CLI by running:

set /SP/serial/host pendingspeed=115200 commitpending=true

To check the current console port speed on the ILOM, using the CLI, run:

show /SP/serial/host

For more information about ILOM configuration, see https://docs.oracle.com/cd/E19203-01/820-1188-12/core_ilom_managing.html.

(Bug ID 30487830, 30439170)

SELinux "Permission watch" messages displayed

Booting UEK R6 in either the SELinux permissive mode or the enforcing mode produces messages similar to the following:

SELinux:  Permission watch in class filesystem not defined in policy. 
SELinux:  Permission watch in class file not defined in policy. 
SELinux:  Permission watch_mount in class file not defined in policy. 
SELinux:  Permission watch_sb in class file not defined in policy.
SELinux: the above unknown classes and permissions will be allowed

These messages are displayed because the SELinux policy does not currently define these permissions. As the last line of the output indicates, unknown classes and permissions are allowed by default, so the messages can be safely ignored.

(Bug ID 30687021, 30687021)

SELinux in enforcing mode with the MLS policy not supported

When SELinux is configured to use the Multilevel Security (MLS) policy and is in enforcing mode, several issues can prevent normal functioning of the operating system, including permission errors when attempting to mount file systems and a likely systemd freeze when booting the operating system.

SELinux in the enforcing mode with the MLS policy is not supported. Note that you can continue to use SELinux in the enforcing mode by using the targeted policy.
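To check whether a system is configured for the unsupported combination, you could inspect the SELINUX and SELINUXTYPE settings in /etc/selinux/config. The following is a minimal sketch, assuming the standard key=value layout of that file. The function name selinux_mls_enforcing is hypothetical, and the check reflects the configuration applied at boot rather than the current runtime mode.

```shell
# selinux_mls_enforcing [CONFIG_FILE]
# Succeed (exit status 0) only if CONFIG_FILE sets both SELINUX=enforcing
# and SELINUXTYPE=mls. CONFIG_FILE defaults to /etc/selinux/config.
selinux_mls_enforcing() (
    cfg=${1:-/etc/selinux/config}
    mode=$(sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$cfg" | tail -n 1)
    type=$(sed -n 's/^SELINUXTYPE=\([a-z]*\).*/\1/p' "$cfg" | tail -n 1)
    [ "$mode" = enforcing ] && [ "$type" = mls ]
)

# Demonstration on a scratch file rather than the real configuration.
demo=$(mktemp)
printf '%s\n' '# sample configuration' 'SELINUX=enforcing' 'SELINUXTYPE=mls' > "$demo"
if selinux_mls_enforcing "$demo"; then
    echo "unsupported: enforcing mode with the MLS policy"
fi
rm -f "$demo"
```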

(Bug ID 30797389, 30609238)

Spurious xs_tcp_setup_socket: connect messages when using NFS

When using NFS, inaccurate messages regarding socket connection errors may be emitted. Messages may appear as follows:

xs_tcp_setup_socket: connect returned unhandled error -107

The underlying connection issue is resolved and any connections that fail are now automatically reopened. Provided no associated functional impact is experienced, this error message may be ignored. Note that this message may also appear as a result of a genuine ongoing connection issue.

(Bug ID 30339848)

mstlink command crashes with core dump when used on Oracle Linux 8

The mstlink command crashes when run on an Oracle Linux 8 system running Unbreakable Enterprise Kernel Release 6. The following output is typical:

sudo mstlink -d 13:00.1
/usr/include/c++/8/bits/stl_vector.h:932: std::vector<_Tp, _Alloc>::reference
std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type)
[with _Tp = unsigned int; _Alloc = std::allocator<unsigned int>;
std::vector<_Tp, _Alloc>::reference = unsigned int&; std::vector<_Tp,
_Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n <
this->size(), true)' failed.
Aborted (core dumped)

This issue is related to system-wide hardening changes that were introduced upstream and are present in Oracle Linux 8. The upstream tools in the mstflint package, including mstlink, do not adequately cater for these hardening changes. You can use alternative tools to gather and configure link information, including ip link, ethtool, ifstat, and ibv_devinfo.

(Bug ID 30993407)

IOMMU kernel option enabled by default

Starting with UEK R5U1, IOMMU functionality is enabled by default in the x86_64 kernel. This change better facilitates single root input-output virtualization (SR-IOV) and other virtualization extensions; however, it is also known to cause boot failures on certain hardware that cannot complete discovery when IOMMU is enabled. Because the feature is enabled by default, its status no longer appears in /proc/cmdline as iommu=on, so you may need to disable it explicitly with a kernel command-line option if boot failure occurs. As an alternative workaround, you can disable IOMMU or Intel VT-d in your system ROM by following your vendor's instructions.

These boot failure issues have been observed on equipment with certain Broadcom network devices, such as HP Gen8 servers. For more detailed information, see https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c04565693.
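Because the default setting is not reflected on the kernel command line, it can help to list any IOMMU-related options that were passed explicitly. The sketch below filters a command line string for the iommu=, intel_iommu=, and amd_iommu= kernel parameters. The function name iommu_options is hypothetical, and which option is appropriate for disabling IOMMU on a given platform (for example, iommu=off or intel_iommu=off) is an assumption to confirm against the kernel parameter documentation.

```shell
# iommu_options CMDLINE
# Print every IOMMU-related option found in a kernel command line string,
# one per line. Prints nothing when the kernel default is in effect.
iommu_options() (
    set -f                      # do not glob-expand words of the command line
    for arg in $1; do
        case $arg in
            iommu=*|intel_iommu=*|amd_iommu=*) echo "$arg" ;;
        esac
    done
)

# Sample command line, not taken from a real system.
iommu_options "BOOT_IMAGE=/vmlinuz ro quiet intel_iommu=off"
```

On a running system, iommu_options "$(cat /proc/cmdline)" shows whether any explicit override is active.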

PCIe hot-plug driver error for virtual machines running on Arm platforms

The PCIe hot-plug driver emits an error message when a virtual machine running on an Arm platform is rebooted. The error is similar to the following message:

[    3.574244] pcieport 0000:00:02.1: pciehp: Failed to check link status

The issue is not replicated on bare metal systems.

(Bug ID 30512596)

(aarch64) Perf tool can result in application slowdown when profiling some virtualized Arm platforms

Note:

The following issue does not affect bare metal installations.

On virtual machines (VMs) that are running on a multi-socket aarch64 platform, invoking the perf top or perf record command can cause application slowdowns. In certain cases, the following message is emitted in a terminal window:

kernel:watchdog: BUG: soft lockup

You can mitigate this problem as follows:

  • To avoid lockup situations and reduce probe effect, you can specify a sample period by using the -c flag with the perf record command, rather than a frequency by using the -F flag. For example, you would use the perf record -c command instead of the perf record -F 100 command.

  • Do not use the perf command with the --all-cpus flag. Instead, specify a minimal number of CPUs by using the -C flag.
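The two recommendations above can be combined in a single invocation. The following sketch only assembles and prints such a command line so that it can be reviewed before use. The function name perf_record_cmd, the sample period of 100000 events, the CPU list 0,1, and the sleep 10 workload are all illustrative assumptions rather than recommended values.

```shell
# perf_record_cmd PERIOD CPUS COMMAND...
# Print a perf record command line that samples every PERIOD events (-c)
# on the listed CPUs only (-C) while COMMAND runs.
perf_record_cmd() {
    period=$1
    cpus=$2
    shift 2
    echo "perf record -c $period -C $cpus -- $*"
}

# Illustrative values only.
perf_record_cmd 100000 0,1 sleep 10
```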

(Bug ID 32834324)

Messages emitted indicating the route cache is full when using IPv6

On some systems, error messages indicating that the route cache is full are emitted when using IPv6. An error similar to the following example may be returned:

[ 5523.456447] Route cache is full: consider increasing sysctl
net.ipv[4|6].route.max_size.

It is unclear what causes these errors or to what size /proc/sys/net/ipv6/route/max_size should be increased. However, on a test system, the issue could not be replicated after running the following command:

sudo sysctl net.ipv6.route.max_size=32768

Until the investigation of this issue is complete, increasing this value is a viable workaround.
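A value set with the sysctl command does not survive a reboot. One way to make the setting persistent is a drop-in file under /etc/sysctl.d, as in the following sketch. The function name persist_route_max_size and the file name 90-ipv6-route-cache.conf are hypothetical, and 32768 is simply the value from the test system above, not a tuned recommendation.

```shell
# persist_route_max_size [SIZE] [DIR]
# Write a sysctl drop-in file that sets net.ipv6.route.max_size to SIZE.
# DIR defaults to /etc/sysctl.d (which requires root privileges).
persist_route_max_size() (
    size=${1:-32768} dir=${2:-/etc/sysctl.d}
    printf 'net.ipv6.route.max_size = %s\n' "$size" \
        > "$dir/90-ipv6-route-cache.conf"
)

# Demonstration in a scratch directory rather than /etc/sysctl.d.
demo=$(mktemp -d)
persist_route_max_size 32768 "$demo"
cat "$demo/90-ipv6-route-cache.conf"
rm -rf "$demo"
```

After writing the file to the real directory, sudo sysctl --system reloads all sysctl configuration files without a reboot.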

(Bug ID 30976607)

IPv6 failback fails when using RoCE

The rdmaip driver does not send IPv6 address change notifications to RDS, which can delay or prevent IPv6 failover when using RoCE. This issue is apparent when active bonding is enabled and occurs only for IPv6. IPv4 failover continues to work correctly.

When the issue is triggered, the following messages may appear in the kernel log:

kernel: rdmaip: could not add 2001:db8:0:f101::50%4/64 to ens2f0 (port 1)
kernel: IPv6: ens2f0: IPv6 duplicate address 2001:db8:0:f101::50 used by 
        50:6b:4b:cb:ef:23 detected!

A fix is in development but is not available at the time of this release. The fix may become available as an errata update.

(Bug ID 31021418)

It is not possible to remove the libpcap package

Attempting to remove the libpcap package, or performing an action that would attempt to remove the package, results in an error because the dependency chain would require the removal of the systemd package, which would break the system.

This is expected behavior in Oracle Linux 8; however, the behavior is mentioned here because in previous Oracle Linux releases, it was possible to remove the libpcap package.

In some circumstances, such as when installing the RDMA packages, libpcap may be upgraded to a newer version than the version provided for the operating system. If you remove these packages, you may wish to also downgrade the libpcap package to match the highest version provided for the operating system in the BaseOS channel or repository. Typically, this might be most easily done by reverting the installation using the dnf history undo command. See the DNF(8) manual page for more information.

(Bug ID 30979601)

Early microcode loading

When booting an Oracle Linux 7 bare-metal system with UEK R6, the following may be reported in the dmesg log:

This kernel doesn't handle early microcode load properly (it tries to load
microcode even in virtualised environment, which may lead to a panic on some
hypervisors), thus the microcode files have not been added to the initramfs
image.        

UEK R6 does, in fact, handle early microcode loading properly. The messages are caused by an outdated microcode_ctl user space package that does not recognize the UEK R6 kernel version.

This issue is fixed in the microcode_ctl-2.1-61.10.0.1 package or later versions.

(Bug ID 31085618)

Reload of lpfc driver emits error messages

Error messages, similar to the following, may be reported when the Broadcom Emulex LightPulse Fibre Channel SCSI driver, lpfc, is unloaded and reloaded:

bmx048-ps kernel: lpfc 0000:13:00.1: 1:(0):2858 FLOGI failure
Status:x9/x30000 TMO:x14 Data x101800 x0
bmx048-ps kernel: lpfc 0000:13:00.1: 1:(0):0820 FLOGI Failed (x300). BBCredit
Not Supported
bmx048-ps kernel: lpfc 0000:13:00.0: 0:(0):2858 FLOGI failure
Status:x9/x30000 TMO:x14 Data x101800 x0
bmx048-ps kernel: lpfc 0000:13:00.0: 0:(0):0820 FLOGI Failed (x300). BBCredit
Not Supported
      

These notices can be safely ignored, provided the devices are properly found after the lpfc module reload completes.

(Bug ID 31598148)

Network latency may increase on InfiniBand fabrics

If the TCP write size is close to the size of the InfiniBand (IB) Maximum Transmission Unit (MTU), applications may experience higher latencies on packet transfers. For example, the default IB MTU is 65520 bytes. An application that uses a TCP write size between 65520 bytes and 128 KB may experience this issue. The issue does not appear when applications use larger (for example, 256 KB) or smaller (for example, 4 KB or 32 KB) TCP write sizes.

Note that Ethernet networks are not affected by this issue.

The default values for the IB MTU and TCP write sizes in Oracle Linux and UEK R6 do not expose the issue. However, applications with modified TCP window sizes, or systems with modified MTU values, can end up in the affected range and expose this issue.

The workaround for this issue is to tune either the MTU of the IB interface or the TCP write size of the application, so that the TCP write size is either smaller than the IB MTU or more than twice the IB MTU. You can tune MTUs dynamically by using the ip link command. Note that tuning of the TCP write size is application specific.
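The rule in the workaround can be expressed as a simple range check: a TCP write size is treated as affected when it lies between the IB MTU and twice the IB MTU, inclusive. The sketch below applies that rule to a few sample sizes. The function name write_size_at_risk is hypothetical and the boundaries are approximations, so a size of exactly 128 KB, which is slightly more than twice the default MTU, falls just outside the computed range.

```shell
# write_size_at_risk MTU WRITE_SIZE
# Succeed (exit status 0) if WRITE_SIZE lies between MTU and 2 x MTU,
# inclusive. Both values are in bytes.
write_size_at_risk() {
    [ "$2" -ge "$1" ] && [ "$2" -le $(( 2 * $1 )) ]
}

# Check a few write sizes against the default IB MTU of 65520 bytes.
for size in 4096 32768 65520 100000 262144; do
    if write_size_at_risk 65520 "$size"; then
        echo "$size: in the affected range"
    else
        echo "$size: outside the affected range"
    fi
done
```

An MTU change itself would take the form sudo ip link set dev ib0 mtu 4092, where the interface name ib0 and the value 4092 are placeholders only.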

(Bug ID 31830430)

(aarch64) Kdump fails to allocate crashkernel memory on some Arm systems

On some 64-bit Arm (aarch64) systems, where insufficient low contiguous memory is available, Kdump may fail due to the system's inability to allocate the minimum crashkernel memory that is typically reserved when the auto value is set.

This issue results in Kdump failing to start and the following errors appearing in the logs:

kdumpctl[3812]: No memory reserved for crash kernel
...
systemd[1]: Failed to start Crash recovery kernel arming.

To work around this issue, manually set the crashkernel low and high values and attempt to set a low value that is below 256 MB. For example, replace crashkernel=auto with crashkernel=800M,high crashkernel=200M,low.
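The replacement described above can be previewed on a command line string before editing the GRUB configuration. The following is a minimal sketch, assuming that sizes are given in megabytes with an M suffix. The function name set_crashkernel is hypothetical, and the 800M and 200M values are the ones from the example above, not sizes validated for any particular system.

```shell
# set_crashkernel CMDLINE HIGH LOW
# Print CMDLINE with crashkernel=auto replaced by explicit high and low
# reservations. Refuses a LOW value that is not below 256M.
set_crashkernel() {
    low_mb=${3%M}
    if ! [ "$low_mb" -lt 256 ] 2>/dev/null; then
        echo "low value must be below 256M (got $3)" >&2
        return 1
    fi
    printf '%s\n' "$1" |
        sed "s/crashkernel=auto/crashkernel=$2,high crashkernel=$3,low/"
}

# Sample command line, not taken from a real system.
set_crashkernel "crashkernel=auto rhgb quiet" 800M 200M
```

The resulting string still has to be written to the GRUB_CMDLINE_LINUX entry and the GRUB configuration regenerated before the change takes effect on the next boot.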

(Bug ID 31554906)