6.40 (SPARC Only) Cluster Recovery May Fail When Using NFSv3 For Server Pool File System

If an Oracle VM Server in a clustered server pool fails, any file locks that the NFS server created while that Oracle VM Server was online remain enforced until the Oracle VM Server recovers and notifies the NFS server to release them. Until that happens, the other Oracle VM Servers in the cluster can be blocked indefinitely. This behavior results from the stateless nature of NFSv3 and its file locking mechanism. On a Solaris NFS server, the locks can be cleared manually using the clear_locks command. However, the problem recurs for as long as you continue to use NFSv3 to host a cluster file system.
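The failure mode described above can be illustrated with a small toy model. This is only a sketch and not how an NFS server is implemented: the class StickyLockTable, its methods, and the host names ovs1 and ovs2 are hypothetical names chosen for illustration. The point it shows is that a lock with no expiry stays in force after its holder disappears, until someone explicitly clears it.

```python
class StickyLockTable:
    """Toy model of locks with no expiry (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps a locked path to the host name of the client holding the lock.
        self._owners = {}

    def acquire(self, path, client):
        """Grant the lock if it is free or already held by the same client."""
        owner = self._owners.setdefault(path, client)
        return owner == client

    def clear_client(self, client):
        """Drop every lock held by `client` and return how many were dropped.

        This loosely stands in for an administrator clearing a failed
        client's locks by hand; it is not a model of any real command.
        """
        stale = [path for path, owner in self._owners.items() if owner == client]
        for path in stale:
            del self._owners[path]
        return len(stale)


table = StickyLockTable()
table.acquire("/pool/heartbeat", "ovs1")   # ovs1 takes the lock, then fails
blocked = not table.acquire("/pool/heartbeat", "ovs2")  # ovs2 stays blocked
table.clear_client("ovs1")                 # manual intervention
granted = table.acquire("/pool/heartbeat", "ovs2")      # ovs2 can now proceed
```

In this model nothing ever removes the entry for ovs1 unless clear_client is called, which mirrors why the problem recurs each time a server fails.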

Because NFSv4 is a stateful protocol with a locking mechanism built into the protocol itself, this problem does not occur if you use NFSv4 to host a cluster file system. Under NFSv4, the server maintains the state associated with file locks under a lease-based model: locks are leased to a client, and the client must renew the lease periodically. If the client fails to renew the lease, the NFS server is free to give the lock to another client. Therefore, SPARC-based server pools that are configured to use clustering should use an NFSv4 export to host the cluster file system.
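The lease-based model can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the NFSv4 protocol itself: the class LeaseLockTable, its methods, the 90-second lease length, and the host names are all hypothetical, and the sketch attaches a lease to each lock for brevity. Time is passed in explicitly so that the behavior is easy to follow.

```python
class LeaseLockTable:
    """Toy model of lease-based locking (hypothetical, for illustration only)."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        # Maps a locked path to (client host name, time at which the lease expires).
        self._locks = {}

    def acquire(self, path, client, now):
        """Grant the lock unless another client holds an unexpired lease on it."""
        holder = self._locks.get(path)
        if holder is not None:
            owner, expires_at = holder
            if owner != client and now < expires_at:
                return False
        self._locks[path] = (client, now + self.lease_seconds)
        return True

    def renew(self, path, client, now):
        """Extend the lease; fails if the client no longer holds a valid lease."""
        holder = self._locks.get(path)
        if holder is None or holder[0] != client or now >= holder[1]:
            return False
        self._locks[path] = (client, now + self.lease_seconds)
        return True


table = LeaseLockTable(lease_seconds=90)
table.acquire("/pool/heartbeat", "ovs1", now=0)    # ovs1 holds the lock until t=90
table.acquire("/pool/heartbeat", "ovs2", now=30)   # refused: lease still valid
# ovs1 fails and never renews; once the lease has run out, ovs2 is granted the lock.
recovered = table.acquire("/pool/heartbeat", "ovs2", now=90)
```

Unlike a lock with no expiry, a failed client here cannot block the others for longer than one lease period, and no manual cleanup is needed.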

Note that x86 server pools are unaffected by this problem because they use an OCFS2 file system, even if they are hosted on an NFS share.

Bug 18997487