3 Known Issues
This chapter describes the known issues in this update.
btrfs, ext4 and xfs: Kernel panic when freeze and unfreeze operations are performed in multiple threads
Freeze and unfreeze operations that are performed across multiple threads on any supported file system can cause the system to hang and the kernel to panic. This problem is the result of a race condition that occurs when the unfreeze operation is triggered before the file system is actually frozen. The resulting unlock operation attempts a write operation on a non-existent lock, resulting in the kernel panic. (Bug ID 25321899)
btrfs
The following are known btrfs issues:
-
Send operation causes soft lockup on large deduped file
Using btrfs send on a large deduped file results in a soft lockup or out-of-memory issue. This problem occurs because the btrfs send operation cannot handle a large deduped file containing file extents that all point to one extent, as these types of file structures create tremendous pressure for the btrfs send operation. To prevent this issue from occurring, do not use btrfs send on systems with less than 4 GB of memory. (Bug ID 25306023)
-
Kernel oops when unmounting during a quota rescan or disable
Operations that trigger a quota rescan or disable the quota on a mounted file system can cause a kernel oops message when you then attempt to unmount the file system. This can cause the system to hang. (Bug ID 22377928)
-
Kernel oops when removing shared extents using qgroup accounting
The removal of shared extents where quota group (qgroup) accounting is used can result in a kernel oops message. This relates to an issue where inaccurate results are obtained during a back reference walk, due to missing records when adding delayed references. (Bug ID 21554517)
-
No warning when balancing file system on RAID
The btrfs filesystem balance command does not warn that the RAID level can be changed under certain circumstances, and does not provide the choice of cancelling the operation. (Bug ID 16472824)
-
Disk space requirement to perform all btrfs operations
The copy-on-write nature of btrfs means that every operation on the file system initially requires disk space. It is possible that you cannot execute any operation on a disk that has no space left, and even removing a file might not be possible. In the case that there is no space to store metadata, an ENOSPC error is returned. In this situation, run sync before retrying an operation, as this can clear a background writeback that might be reserving metadata space. Another potential workaround is to add a disk or a file-backed loop device using the btrfs device add command.
The mechanism that is used to store data and metadata might lead to some confusion on the information returned by tools like df. Sometimes, metadata might fill all of the disk space allocated for this purpose, even while there is still space available for data. In this case, the file system is unbalanced and the problem can be resolved by performing a btrfs fi balance operation. See https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#I_get_.22No_space_left_on_device.22_errors.2C_but_df_says_I.27ve_got_lots_of_space.
-
Double count of overwritten space in qgroup show
When you overwrite data in a file, starting somewhere in the middle of the file, the overwritten space is counted twice in the space usage numbers that btrfs qgroup show displays. Running btrfs quota rescan does not fix this issue. (Bug ID 16609467)
-
Sector size should match page size
If you use the -s option to specify a sector size to mkfs.btrfs that is different from the page size, the created file system cannot be mounted. By default, the sector size is set to be the same as the page size. (Bug ID 17087232)
-
Location of btrfs-progs and btrfs-progs-devel packages
The btrfs-progs and btrfs-progs-devel packages for use with UEK R4 are made available in the ol6_x86_64_UEKR4 and ol7_x86_64_UEKR4 ULN channels and the ol6_UEKR4 and ol7_UEKR4 channels on the Oracle Linux Yum Server. In UEK R3, these packages were made available in the ol6_x86_64_latest and ol7_x86_64_latest ULN channels and the ol6_latest and ol7_latest channels on the Oracle Linux Yum Server.
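The sector size issue in the list above can be guarded against before a file system is created. The following is a minimal sketch, assuming a POSIX shell and that getconf PAGESIZE reports the page size of the running kernel. The function name sector_matches_page is hypothetical and is not part of btrfs-progs.

```shell
# Hypothetical helper: succeed only if the requested sector size
# equals the page size reported by getconf.
sector_matches_page() {
    page_size=$(getconf PAGESIZE) || return 2
    [ "$1" -eq "$page_size" ]
}

# Example (not run here): only pass -s to mkfs.btrfs when the sizes match.
# if sector_matches_page 4096; then
#     mkfs.btrfs -s 4096 /dev/sdb1
# fi
```

Omitting the -s option altogether has the same effect, because the default sector size is the page size.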
ext4
The following are known ext4 issues:
-
System hangs when processing corrupted orphaned inode list
If the orphaned inode list is corrupted, the inode might be processed repeatedly, resulting in a system hang. For example, if the orphaned inode list contains a reference to the bootloader inode, ext4_iget() returns a bad inode, which can result in a processing loop that hangs the system. (Bug ID 24433290)
-
System hangs on unmount after an append to a file with negative i_size
While it is invalid for a file system to load an inode with a negative i_size, it is possible to create a file like this and append to it. However, doing so causes an integer overflow in the routines underlying writeback, resulting in the kernel locking up. (Bug ID 25565527)
xfs
The following are known xfs issues:
-
File system corruption occurs after direct I/O writes
A race condition that results in post-eof blocks being used for direct I/O writes causes a corruption in the file system. If a file release occurs during a file extending direct I/O write, it is possible to mistake the post-eof blocks for speculative preallocation and incorrectly truncate them from the inode. This issue is unlikely to be reproduced in real-world workloads. (Bug ID 26128822)
-
Invalid corrupted file system error resulting from a problem with log recovery on v5 superblocks
A problem with log recovery on v5 superblocks, which causes the metadata LSN not to be updated for buffers that log recovery writes out, can result in a corruption error similar to the following:
[1044224.901444] XFS (sdc1): Metadata corruption detected at xfs_dir3_block_write_verify+0xfd/0x110 [xfs], block 0x1004e90
[1044224.901446] XFS (sdc1): Unmount and run xfs_repair
...
[1044224.901460] XFS (sdc1): xfs_do_force_shutdown(0x8) called from line 1249 of file fs/xfs/xfs_buf.c. Return address = 0xffffffffa07a8910
[1044224.901462] XFS (sdc1): Corruption of in-memory data detected. Shutting down filesystem
[1044224.901463] XFS (sdc1): Please umount the filesystem and rectify the problem(s)
[1044224.904207] XFS (sdc1): log mount/recovery failed: error -117
[1044224.904456] XFS (sdc1): log mount failed
This problem is encountered because the log attempts to replay a buffer update that is no longer valid due to subsequent replayed updates. The result is a corruption error, when in fact, the file system is fine. (Bug ID 25380003)
-
System hangs on unmount after a buffered append to a file with negative i_size
While it is invalid for a file system to load an inode with a negative i_size, it is possible to create a file like this. When a buffered append is made to such a file, an integer overflow in the routines underlying writeback results in the kernel locking up. A direct append does not cause this behavior. (Bug ID 25565490)
-
System hangs during xfs_fsr on two-extent files with speculative preallocation
During an xfs_fsr process on extents that are generated by speculative preallocation, the code that determines whether all of the extents fit inline miscalculates because the di_nextents call that is used does not account for these extents. This results in corruption of the in-memory inode, and ultimately the code attempts to move memory structures using incorrectly calculated ranges. This causes a kernel panic. (Bug ID 25333211)
-
XFS quotas are disabled after a read-only remount on Oracle Linux 6
Quotas are disabled on XFS if the file system is remounted with read-only permissions on Oracle Linux 6. (Bug ID 22908906)
-
Overlay file system is unable to mount on XFS where there is no d_type support
Overlay file systems rely on a feature known as d_type support. This feature is a field within a data structure that provides some metadata about files in a directory entry within the base file system. Overlay file systems use this field to track many file operations such as file ownership changes and whiteouts. d_type support can be enabled in XFS when the file system is created, by using the -n ftype=1 option. When d_type support is not enabled, an overlay file system might become corrupt and behave in unexpected ways. For this reason, this update release of UEK R4 prevents the mounting of an overlay file system on an XFS base where d_type support is not enabled.
The root partition on Oracle Linux is automatically formatted with -n ftype=0 where XFS is selected as the file system. Thus, for backward compatibility reasons, if you have overlay file systems in place already and these are not hosted on alternate storage, you must migrate them to a file system that is formatted with d_type support enabled.
To check that the XFS file system is formatted correctly:
# xfs_info /dev/sdb1 |grep ftype
Replace /dev/sdb1 with the path to the correct storage device. If the information returned by this command includes ftype=0, you must migrate the overlay data held in this directory to storage that is formatted correctly.
To correctly format a new block device with the XFS file system with support for overlay file systems, run:
# mkfs -t xfs -n ftype=1 /dev/sdb1
Replace /dev/sdb1 with the path to the correct storage device. It is essential that you use the -n ftype=1 option when you create the file system.
If you do not have additional block storage available, it is possible to create an XFS file system image that can be mounted as a loopback device. For example, to create a 5 GB image file in the root directory, you could use the following command:
# mkfs.xfs -d file=1,name=/OverlayStorage,size=5g -n ftype=1
To temporarily mount this file, you can enter:
# mount -o loop -t xfs /OverlayStorage /mnt
An entry in /etc/fstab to make a permanent mount for this storage might look similar to the following:
/OverlayStorage /mnt xfs loop 0 0
This configuration can help as a temporary solution to solve upgrade issues. However, using a loopback mounted file system image as a form of permanent storage is not recommended for production environments. (Bug ID 26165630)
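The ftype check described for the overlay issue can be wrapped in a small filter for use in upgrade scripts. The following is a minimal sketch, assuming that xfs_info prints the ftype=0 or ftype=1 token in its naming line, as the grep command above implies. The function name xfs_ftype_status is hypothetical.

```shell
# Hypothetical helper: read xfs_info output on standard input and
# report whether d_type support (ftype=1) is enabled.
xfs_ftype_status() {
    ftype=$(grep -o 'ftype=[01]' | head -n 1)
    case "$ftype" in
        ftype=1) echo "supported" ;;
        ftype=0) echo "unsupported" ;;
        *)       echo "unknown" ;;
    esac
}

# Example (not run here; replace /dev/sdb1 with your device):
# xfs_info /dev/sdb1 | xfs_ftype_status
```

A result of "unknown" means that no ftype field was found in the input, in which case the output of xfs_info should be inspected manually.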
DIF/DIX is not supported for ext file systems
The Data Integrity Field (DIF) and Data Integrity Extension (DIX) features that have been added to the SCSI standard are dependent on a file system that is capable of correctly handling attempts by the memory management system to change data in the buffer while it is queued for a write.
The ext2, ext3 and ext4 file system drivers do not prevent pages from being modified during I/O, which can cause checksum failures and a "Logical block guard check failed" error. Other file systems such as XFS are supported. (Bug ID 24361968)
Console appears to hang when booting
When booting Oracle Linux 6 on hardware with an ASPEED graphics controller, the console might appear to hang during the boot process after starting udev. However, the system does boot properly and is accessible. The workaround is to add nomodeset as a kernel boot parameter in /etc/grub.conf. (Bug ID 22389972)
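The nomodeset workaround can be applied by hand in an editor. As an illustration only, the following sketch shows a filter that appends the parameter to each kernel line of a legacy GRUB configuration, assuming that kernel lines in /etc/grub.conf begin with the word kernel. The function name add_nomodeset is hypothetical, and the filter writes to standard output rather than modifying the file.

```shell
# Hypothetical helper: read a grub.conf on standard input and append
# nomodeset to every kernel line that does not already contain it.
add_nomodeset() {
    awk '
        $1 == "kernel" && $0 !~ /[ \t]nomodeset([ \t]|$)/ { $0 = $0 " nomodeset" }
        { print }
    '
}

# Example (not run here): review the result before replacing the original.
# add_nomodeset < /etc/grub.conf > /tmp/grub.conf.new
```

Writing to a separate file first makes it possible to compare the result with the original before it is put in place.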
Docker
The following are known Docker issues:
-
Running yum install within a container on an overlayfs file system can fail with the following error:
Rpmdb checksum is invalid: dCDPT(pkg checksums): package_name
This error can break Dockerfile builds but is expected behavior from the kernel and is a known issue upstream (see https://github.com/docker/docker/issues/10180).
The workaround is to run touch /var/lib/rpm/* before installing the package.
Note that this issue is fixed in any Oracle Linux images available on the Docker Hub or Oracle Container Registry, but the issue could still be encountered when running any container based on a third-party image. (Bug ID 21804564)
-
Docker can fail where it uses the overlay2 storage driver on XFS-formatted storage
A kernel patch has been applied to prevent overlay mounts on XFS if the ftype is not set to 1. This fix resolves an issue where XFS did not properly support the whiteout features of an overlay filesystem if d_type support was not enabled. If the Docker Engine is already using XFS-formatted storage with the overlay2 storage driver, an upgrade of the kernel can cause Docker to fail if the underlying XFS file system was not created with the -n ftype=1 option enabled. The root partition on Oracle Linux 7 is automatically formatted with -n ftype=0 where XFS is selected as the file system. Therefore, if you intend to use the overlay2 storage driver in this environment, you must format a separate device for this purpose. (Bug ID 25995797)
-
Docker can fail where it uses the overlay2 storage driver and SELinux is enabled
If the Docker Engine is configured to use the overlay2 storage driver and SELinux is enabled and set to Enforcing mode, Docker containers are unable to function properly and permissions errors are encountered. If you intend to use Docker with the overlay2 storage driver, you must set SELinux to Permissive mode. (Bug ID 25684456)
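For the SELinux issue in the list above, Permissive mode is commonly set for the running system with setenforce 0 and made persistent through the SELINUX= line in /etc/selinux/config. The following is a minimal sketch of a filter that rewrites that line, assuming the standard layout of the configuration file. The function name selinux_config_permissive is hypothetical.

```shell
# Hypothetical helper: read an SELinux config on standard input and
# rewrite the SELINUX= line to permissive. SELINUXTYPE= is left untouched.
selinux_config_permissive() {
    sed 's/^SELINUX=.*/SELINUX=permissive/'
}

# Example (not run here):
# setenforce 0
# selinux_config_permissive < /etc/selinux/config > /tmp/selinux.config.new
```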
DTrace
The following are known DTrace issues:
-
Argument declarations with USDT probe definitions cannot be declared with derived types such as enum, struct, or union.
-
The following compiler warning can be ignored for USDT probe definition arguments of type string (which is a D type but not a C type):
provider_def.h:line#: warning: parameter names (without types) in function declaration
-
When ustack(), usym(), uaddr() and umod() are used on multi-threaded processes that perform dlopen() in threads other than the first thread, symbols introduced by dlopen() might not be resolved accurately. (Bug ID 20045149)
Error, some other host already uses address xxx.xxx.xxx.xxx
The following error message might be triggered in certain instances:
Error, some other host already uses address xxx.xxx.xxx.xxx
The following are the two instances in which this error message might be triggered:
-
When active-bonding is enabled, and you run the ifup ib-interface command.
-
When you run the service rdma start command.
You can ignore this message, as in both cases, the InfiniBand interface is brought up successfully. (Bug IDs 21052903, 26639723)
ifup-ib: line 357: /sys/class/net/ib0/acl_enabled: Permission denied error
Running ifup ib-interface or service network restart reports the following error:
/etc/sysconfig/network-scripts/ifup-ib: line 357: /sys/class/net/ib0/acl_enabled: Permission denied
This error is reported, even though the InfiniBand interface is brought up successfully.
The workaround for this issue is to change from using the older configuration method, where you manipulate sysfs files, to the newer ibacl tools that are provided. (Bug ID 26197105)
Increased dom0 memory requirement when using Mellanox® HCAs on Oracle VM Server
Oracle VM Servers running UEKR4u2 and upward in dom0 require at least 400MB more memory to use the Mellanox® drivers. This memory requirement is a result of the default size of the SRQ count being increased from 64K to 256K in later versions of the kernel, and of the scale_profile option now being enabled by default in the mlx4_core module.
In the case where out-of-memory errors are observed in dom0, the maximum dom0 memory size should be increased. Alternative workarounds might involve manually setting the module parameters for the mlx4_core driver. To set these parameters, edit /etc/modprobe.d/mlx4_core.conf and set scale_profile to 0. Alternatively, set log_num_srq to 16. The preferred resolution to this issue is to increase the memory allocated to dom0 on an Oracle VM Server. (Bug ID 23581534)
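Module parameters in /etc/modprobe.d files are normally written as options lines. The following is a minimal sketch that generates such a line for the two parameters named above, assuming the standard modprobe.d syntax. The function name mlx4_option_line is hypothetical.

```shell
# Hypothetical helper: print a modprobe.d options line for mlx4_core.
# $1 is the parameter name, $2 is the value.
mlx4_option_line() {
    printf 'options mlx4_core %s=%s\n' "$1" "$2"
}

# Example (not run here): use one of the two alternatives.
# mlx4_option_line scale_profile 0 > /etc/modprobe.d/mlx4_core.conf
# mlx4_option_line log_num_srq 16 > /etc/modprobe.d/mlx4_core.conf
```

Settings in modprobe.d files are typically read when the module is next loaded, so a change is unlikely to affect a driver that is already running.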
LXC
The following are known LXC issues:
-
The lxc-net service does not always start immediately after installation on Oracle Linux 6
The lxc-net service does not always start immediately after installation on Oracle Linux 6, even though this action is specified as part of the RPM post-installation script. This can prevent the lxcbr0 interface from coming up. If this interface is not up after installation, you can manually start it by running service lxc-net start. (Bug ID 23177405)
-
LXC read-only ip_local_port_range parameter
With lxc-1.1 or later and UEK R4, ip_local_port_range is a read-writable parameter under /proc/sys/net/ipv4 in an Oracle Linux container rather than being read-only. (Bug ID 21880467)
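For the lxc-net issue in the list above, the state of the lxcbr0 interface can be tested before the service is started manually. The following is a minimal sketch, assuming that the ip command from iproute2 is available and prints interface flags between angle brackets. The function name link_is_up is hypothetical.

```shell
# Hypothetical helper: read `ip -o link show <interface>` output on
# standard input and succeed if the flag list contains UP.
# LOWER_UP alone does not count as UP.
link_is_up() {
    grep -Eq '<([^>]*,)?UP(,[^>]*)?>'
}

# Example (not run here): start the service only if lxcbr0 is not up.
# ip -o link show lxcbr0 | link_is_up || service lxc-net start
```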
MSI-X interrupt allocation fails during maximum number of ixgbe/ixgbevf Virtual Function creation
The Intel ixgbe/ixgbevf and QLogic qla2xxx drivers compete for MSI-X resources. As a result, if both drivers are used in a system, and an attempt is made to create the maximum number of Virtual Function (VF) devices that are allowed for the ixgbe/ixgbevf driver, an interrupt allocation failure occurs during the creation of the last VF device.
Note that you can create and use up to, but not including, the maximum number of VF devices that are allowed for the ixgbe/ixgbevf driver without encountering this issue. (Bug ID 25952728)
NVMe devices not found under the /dev directory after PCI rescan
After removing the PCI bus of NVM Express (NVMe) adapter card devices and running a rescan of the PCI bus, no NVMe adapter card devices are found under the /dev directory.
The workaround for this issue is to also remove the PCI slot that the NVMe adapter card device is plugged into before running a rescan of the PCI bus. (Bug ID 26610285)
OFED iSER target login fails from an initiator on Oracle Linux 6
An Oracle Linux 6 system that has the oracle-ofed-release packages installed and an iSER (iSCSI Extensions for RDMA) target configured fails to log in to the iSER target as an initiator. On the Oracle Linux 6 initiator machine, the following behavior is typical:
# iscsiadm -m node -T iqn.iser-target.t1 -p 10.196.100.134 --login
Logging in to [iface: default, target: iqn.iser-target.t1, portal: 10.196.100.134,3260] (multiple)
iscsiadm: Could not login to [iface: default, target: iqn.iser-target.t1, portal: 10.196.100.134,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not log into all portals
This is expected behavior resulting from an errata fix for CVE-2016-4564, to protect against a write from an invalid context.
(Bug ID 23615903)
Open File Description (OFD) locks are not supported on NFSv4 mounts
NFS is not designed to handle OFD locking. (Bug ID 22948696).
SDP performance degradation
The Sockets Direct Protocol (SDP), which was designed to provide an RDMA alternative to TCP over InfiniBand networks, is known to suffer from performance degradation on more recent kernels such as UEK R4u2 and later. There is no active development on this protocol.
Although the library for this protocol is still available for this kernel, support is limited. You should consider using TCP on top of IP over InfiniBand as a more stable alternative. (Bug ID 22354885)
Shared Receive Queue (SRQ) is an experimental feature for RDS and is disabled by default
The SRQ function that optimizes resource usage within the rds_rdma module is experimental and is disabled by default. A warning message is displayed when you enable this feature by setting the rds_ib_srq_enabled flag. (Bug ID 23523586).
Unloading or removing the rds_rdma module is unsupported
Once the rds_rdma module has been loaded, you cannot remove the module using either rmmod or modprobe -r. Unloading of the rds_rdma module is unsupported and can trigger a kernel panic. Do not set the module_unload_allowed flag for this module. (Bug ID 23580850).