Storage Services Issues
This section describes known issues and workarounds related to the functionality of the internal ZFS storage appliance and the different storage services: block volume storage, object storage and file system storage.
Updating Terraform Changes File Storage Export Path
When you use Terraform to create a file system export, you must specify AUTOSELECT as the value of path in the oci_file_storage_export definition. You must also include the lifecycle stanza to ignore any updates to the path. If you do not ignore updates to the path, the path is automatically deleted and re-created whenever you apply an update to the Terraform configuration, even if you do not explicitly change the path. Updating the path can interrupt clients that have an active mount through the export.
Workaround: Set the path and include the lifecycle stanza as shown in the following example:

resource "oci_file_storage_export" "pcauserExport" {
  export_set_id  = local.Okit_MT_1702774958525ExportSet_id
  file_system_id = local.Okit_FS_1702774481898_id
  path           = "AUTOSELECT"
  lifecycle {
    ignore_changes = [
      path,
    ]
  }
}
Bug: 36116003
Version: 3.0.2
Creating Image from Instance Takes a Long Time
When you create a new compute image from an instance, its boot volume goes through a series of copy and conversion operations. In addition, the virtual disk copy is non-sparse, which means the full disk size is copied bit-for-bit. As a result, image creation time increases considerably with the size of the base instance's boot volume.
Workaround: Wait for the image creation job to complete. Check the work request status in the Compute Web UI, or use the work request OCID to check its status in the CLI.
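Because the copy is non-sparse, transfer time scales linearly with the boot volume size, regardless of how much of the volume is actually in use. The following sketch illustrates that scaling; the throughput figure is a hypothetical assumption for illustration, not a measured value for the appliance.

```python
def estimated_copy_seconds(volume_size_gb, throughput_mb_per_s=200):
    """Rough lower bound for one non-sparse, bit-for-bit volume copy pass.

    throughput_mb_per_s is a hypothetical example figure; actual throughput
    depends on the storage configuration and current load.
    """
    volume_size_mb = volume_size_gb * 1024
    return volume_size_mb / throughput_mb_per_s

# A 512 GB boot volume at the assumed 200 MB/s takes roughly 44 minutes
# per copy pass, and image creation involves multiple copy and
# conversion operations, so the total job time is a multiple of this.
minutes = estimated_copy_seconds(512) / 60
```

The point of the sketch is that a larger boot volume increases image creation time proportionally, even if the volume is mostly empty.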
Bug: 33392755
Version: 3.0.1
Large Object Transfers Fail After ZFS Controller Failover
If a ZFS controller failover or failback occurs while a large file is uploaded to or downloaded from an object storage bucket, the connection may be aborted, causing the data transfer to fail. Multipart uploads are affected in the same way. The issue occurs when you use a version of the OCI CLI that does not provide the retry function in case of a brief storage connection timeout. The retry functionality is available as of version 3.0.
Workaround: For a more reliable transfer of large objects and multipart uploads, use OCI CLI version 3.0 or newer.
Bug: 33472317
Version: 3.0.1
Use Multipart Upload for Objects Larger than 100MiB
Uploading very large files to object storage is susceptible to connection and performance issues. For maximum reliability of file transfers to object storage, use multipart uploads.
Workaround: Transfer files larger than 100MiB to object storage using multipart uploads. This behavior is expected; it is not considered a bug.
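Multipart uploads improve reliability because each part is transferred, and can be retried, independently. The sketch below shows the part arithmetic; the 128 MiB part size is an arbitrary example value, not a requirement.

```python
import math

def plan_parts(object_size_bytes, part_size_bytes=128 * 1024 * 1024):
    """Return how many parts a multipart upload would use.

    Objects at or below the 100 MiB guidance can go in a single PUT;
    the 128 MiB default part size here is an example choice.
    """
    if object_size_bytes <= 100 * 1024 * 1024:
        return 1
    return math.ceil(object_size_bytes / part_size_bytes)

# A 5 GiB object split into 128 MiB parts yields 40 parts, each of
# which can be retried on its own after a connection error instead of
# restarting the whole transfer.
parts = plan_parts(5 * 1024 ** 3)
```

Tools such as the OCI CLI perform this split automatically when a multipart upload is requested.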
Bug: 33617535
Version: n/a
File System Export Temporarily Inaccessible After Large Export Options Update
When you update a file system export to add a large number of 'source'-type export options, the command returns a service error that suggests the export no longer exists ("code": "NotFound"). In fact, the export is only inaccessible until the configuration update has completed. If you try to access the export or display its stored information during that time, a similar error is displayed. This behavior is caused by the method used to update file system export options: the existing configuration is deleted and replaced with a new one containing the requested changes. It is only noticeable in the rare use case where dozens of export options are added at the same time.
Workaround: Wait for the update to complete and the file system export to become available again. The CLI command oci fs export get --export-id <fs_export_ocid> should then return the information for the export in question.
Bug: 33741386
Version: 3.0.1
Block Volume Stuck in Detaching State
Block volumes can be attached to several different compute instances, and can even have multiple attachments to the same instance. When simultaneous detach operations are run against the same volume, as can happen with automation tools, the processes may interfere with each other. For example, different work requests may try to update resources on the ZFS Storage Appliance at the same time, resulting in stale data in a work request, or in resource update conflicts on the appliance. When block volume detach operations fail in this manner, the block volume attachments in question may become stuck in detaching state, even though the block volumes have already been detached from the instances.
Workaround: If you have instances with block volumes stuck in detaching state, the volumes have been detached, but further manual cleanup is required. The detaching state cannot be cleared, but the affected instances can be stopped and the block volumes can be deleted if that is the end goal.
Bug: 33750513
Version: 3.0.1
Fix available: Please apply the latest patches to your system.
Detaching Volume Using Terraform Fails Due To Timeout
When you use Terraform to detach a volume from an instance, the operation may fail with an error message indicating the volume attachment was not destroyed and the volume remains in attached state. This can occur when the storage service does not confirm that the volume was detached before Terraform stops polling the state of the volume attachment. The volume may still be detached successfully after Terraform has reported the error.
Workaround: Re-apply the Terraform configuration. If the errors were the result of a timeout, then the second run will be successful.
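If the failure is purely a polling timeout, extending the deletion timeout on the volume attachment resource may also avoid the error. The sketch below assumes the OCI Terraform provider's oci_core_volume_attachment resource supports a configurable timeouts block; the resource names and the 20-minute value are illustrative placeholders, not values from this document.

```hcl
resource "oci_core_volume_attachment" "example" {
  attachment_type = "paravirtualized"
  instance_id     = oci_core_instance.example.id
  volume_id       = oci_core_volume.example.id

  # Give the storage service more time to confirm the detach before
  # Terraform stops polling (20m is an example value, not a recommendation).
  timeouts {
    delete = "20m"
  }
}
```

Check the provider documentation for the resource to confirm which timeout operations are supported before relying on this.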
Bug: 35256335
Version: 3.0.2
Creating File System Export Fails Due To Timeout
When many file system operations are executed in parallel, timing becomes a critical factor and can occasionally lead to a failure. More specifically, the creation of a file system export can time out because the file system is temporarily unavailable. The error returned in that case is: "Internal Server Error: No such filesystem to create the export on".
Workaround: Because this error is caused by a resource locking and timeout issue, it is expected that the operation will succeed when you try to execute it again. This error only occurs in rare cases.
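Since the error is transient, automation that creates exports can simply retry the call a few times. The sketch below shows that pattern in generic form; create_export and the RuntimeError are hypothetical stand-ins for the actual API call and service error, not part of any SDK.

```python
import time

def with_retries(operation, attempts=3, delay_seconds=5):
    """Retry a transient operation a few times before giving up.

    operation is any zero-argument callable, e.g. a hypothetical
    lambda wrapping a create-export API call. RuntimeError stands in
    for the transient "Internal Server Error" described above.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except RuntimeError as err:
            last_error = err
            time.sleep(delay_seconds)
    raise last_error
```

A short fixed delay between attempts is usually enough here, because the failure clears as soon as the conflicting file system operation releases its lock.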
Bug: 34778669
Version: 3.0.2
File System Access Lost When Another Export for Subset IP Range Is Deleted
A virtual cloud network (VCN) can contain only one file system mount target. All file systems made available to instances connected to the VCN must have exports defined within its mount target. File system exports can provide access to different file systems from overlapping subnets or IP address ranges. For example: filesys01 can be made available to IP range 10.25.4.0/23 and filesys02 to IP range 10.25.5.0/24. The latter IP range is a subset of the former. Due to the way the mount IP address is assigned, when you delete the export for filesys02, access to filesys01 is removed for the superset IP range as well.
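The subset relationship in the example above can be checked with Python's standard ipaddress module, which is a quick way to audit existing exports for this overlap before deleting one of them:

```python
import ipaddress

# Source ranges from the example: filesys02's range is a subset of
# filesys01's, which is the overlap that triggers this issue.
filesys01_source = ipaddress.ip_network("10.25.4.0/23")
filesys02_source = ipaddress.ip_network("10.25.5.0/24")

overlaps = filesys02_source.subnet_of(filesys01_source)  # True
```

If subnet_of() returns True for any pair of export source ranges in the same mount target, deleting the export for the smaller range is affected by this issue.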
Workaround: If your file system exports have overlapping source IP address ranges, and deleting one export causes access issues with another export similar to the example above, then it is recommended to delete the affected exports and create them again within the VCN mount target.
Bug: 33601987
Version: 3.0.2
File System Export UID/GID Cannot Be Modified
When creating a file system export you can add extra NFS export options, such as access privileges for source IP addresses and identity squashing. Once you have set a user/group identity (UID/GID) squash value in the NFS export options, you can no longer modify that value. When you attempt to set a different ID, an error is returned: "Uid and Gid are not consistent with FS AnonId: <currentUID>"
Workaround: If you need to change the UID/GID mapping, delete the NFS export options and recreate them with the desired values. If you are using the OCI CLI, you must delete the entire file system export (not just the options) and recreate the export, specifying the desired values with the --export-options parameter.
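The --export-options parameter takes a JSON array. The fragment below is a sketch of its shape; the key names follow the File Storage API's export option model and the addresses and ID values are placeholders, so verify the exact schema with oci fs export create --help before use.

```json
[
  {
    "source": "10.25.4.0/23",
    "access": "READ_WRITE",
    "identitySquash": "ALL",
    "anonymousUid": 65534,
    "anonymousGid": 65534
  }
]
```

Supplying the full array when recreating the export is what allows the new UID/GID squash values to take effect.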
Bug: 34877118
Version: 3.0.2
Block Volume Performance Level Not Preserved During Cloning
The block volumes provisioned on the ZFS Storage Appliance are located in either the standard or the high-performance pool. The performance level is reflected in the properties of each block volume as volume performance units (VPUs) per GB. However, when cloning a volume group or volume group backup, the performance level of all new block volumes produced by the clone operation is set to 0. CLI output shows the parameter "vpus-per-gb": 0 in the properties of the block volume clone.
Workaround: There is no workaround available. The block volume clones are placed in the correct storage pool, meaning their performance level is as intended.
Bug: 35333587
Version: 3.0.2
Internal Backups for Instance Cloning Not Displayed
When you clone a compute instance, an internal backup of the boot and block volumes is created. In appliance software versions up to 3.0.2-b852928 those internal backups are visible to users. While not recommended, the backups could technically be used to create additional instances. Existing internal backups are not deleted during appliance upgrade or patching. However, in newer software versions the internal backups are no longer exposed.
Workaround: Do not create clones or new compute instances from the existing internal volume (group) backups. To remove old backups of storage volumes, ensure that all other backups and clones of the original source volume are terminated first.
Bug: 35406033
Version: 3.0.2
Limit for Volume Backups Not Enforced
The "Service Limits" chapter in the Oracle Private Cloud Appliance Release Notes specifies a limit of 100 volume backups per tenancy for a system with default storage capacity. This limit is not enforced: you can continue to create volume backups beyond the documented maximum.
Workaround: In theory, the maximum number of volume backups is limited only by the available storage on the ZFS Storage Appliance. The system is expected to handle thousands of volume backups across all tenancies. However, we recommend that an administrator monitor storage space consumption proactively if users create many volume backups.
Bug: 35509673
Version: 3.0.2
NFS Service Interruption During ZFS Storage Appliance Firmware Upgrade or Patching
When the firmware of the appliance's internal ZFS Storage Appliance is upgraded or patched, compute instances could experience an interruption of NFS connectivity. The service outage occurs when a failover or failback is performed between the storage appliance controllers, and it can take over 2 minutes to reestablish the NFS service. Multiple factors can contribute to the delay: the NFS server's 90-second grace period that allows NFSv4 clients to recover locking state after an outage, the NFS protocol attempting to reconnect to the same TCP port, and the NFS client's kernel version.
Workaround: To reduce the outage time of NFS connectivity, it is recommended to use the mount options described in the note with Doc ID 359515.1. While the document describes optimizations for Oracle RAC and Oracle Clusterware, the mount options also improve NFS performance and stability in a Private Cloud Appliance environment.
Bug: 36348165
Version: 3.0.2