2.1.7 Improved RoCE Network Resilience

Oracle Exadata relies on the high-performance RoCE Network Fabric that connects all the database and storage servers. Occasionally, administrator errors or network firmware issues can interfere with the RoCE Network Fabric, resulting in seemingly functional network links that cannot support network traffic. If left unattended, such cases can cause various failures, including database instance failures and cluster outages.

Oracle Exadata System Software release 24.1.0 contains an automatic monitoring function that constantly checks the RoCE Network Fabric. If network traffic stalls on a link without any apparent link error, the associated IP address automatically migrates to the partner link. Later, when the stalled link resumes functioning, the IP address automatically returns to it.
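
The failover and failback behavior described above can be pictured with a small model. The following Python sketch is illustrative only and does not represent the Oracle Exadata System Software implementation. The Link class, the traffic_ok probe flag, the reconcile function, and the link names are hypothetical, and the addresses are taken from the reserved documentation range.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """Hypothetical model of one port in a pair of partner links."""
    name: str
    link_up: bool = True       # link-layer state as reported by the port
    traffic_ok: bool = True    # assumed result of a traffic probe
    ips: set = field(default_factory=set)       # addresses currently hosted
    home_ips: set = field(default_factory=set)  # addresses that normally live here

    def healthy(self) -> bool:
        return self.link_up and self.traffic_ok

    def stalled(self) -> bool:
        # The link reports no error, yet traffic does not flow.
        return self.link_up and not self.traffic_ok


def reconcile(a: Link, b: Link) -> None:
    """Run one monitoring pass over a pair of partner links."""
    for link, partner in ((a, b), (b, a)):
        if link.stalled() and partner.healthy():
            # Fail over: move every address on the stalled link to its partner.
            partner.ips |= link.ips
            link.ips.clear()
        elif link.healthy():
            # Fail back: reclaim home addresses currently parked on the partner.
            parked = link.home_ips & partner.ips
            partner.ips -= parked
            link.ips |= parked


# Example: link_a stalls, its address moves to link_b, then returns on recovery.
link_a = Link("link_a", ips={"192.0.2.1"}, home_ips={"192.0.2.1"})
link_b = Link("link_b", ips={"192.0.2.2"}, home_ips={"192.0.2.2"})

link_a.traffic_ok = False
reconcile(link_a, link_b)   # link_b now hosts both addresses

link_a.traffic_ok = True
reconcile(link_a, link_b)   # each address is back on its home link
```

In this model, an address moves only when the partner link is healthy, so a pass in which both links are stalled leaves the addresses where they are.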

This capability improves the resilience of the RoCE Network Fabric by automatically guarding against a variety of failures that stalled network links could otherwise cause.