You can use the ibdiagnet command to inject test packets into the fabric and determine which links are experiencing symbol errors and recovery errors.
# ibdiagnet -c 1000 -P all=1
In this example, the ibdiagnet command injects 1000 test packets into each link. The -P all=1 option reports every Performance Monitor counter that increments during the test, listed by the GUID and port of the InfiniBand device.
In the ibdiagnet output, the line that reports symbol errors contains the symbol error count in hexadecimal. The preceding lines identify the node and port with the errors. Symbol errors are minor errors. If relatively few occur during the diagnostic, the link can simply be monitored.
Note - Based on the bit error rate (BER) of 10^-12 given in the InfiniBand specification, the maximum allowable symbol error rate is 120 errors per hour.
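Because the count is reported in hexadecimal and the limit is stated per hour, a small shell helper can make the comparison easier. The following is only a sketch: the function name symbol_rate_status is hypothetical and not part of ibdiagnet, and it assumes that you copy the hexadecimal count from the output by hand and supply the number of seconds over which the errors accumulated.

```shell
# Hypothetical helper (not part of ibdiagnet): compare a symbol error
# count against the 120-errors-per-hour limit.
#   $1  symbol error count in hexadecimal, with or without a 0x prefix
#   $2  observation time in seconds (assumed to be supplied by you)
symbol_rate_status() {
    hex=${1#0x}                 # drop an optional 0x prefix
    count=$(( 0x$hex ))         # convert hexadecimal to decimal
    seconds=$2
    # Integer-only test of count/seconds against 120/3600.
    if [ $(( count * 3600 )) -gt $(( 120 * seconds )) ]; then
        echo "exceeds"
    else
        echo "ok"
    fi
}

symbol_rate_status 0x1f 3600    # 31 errors in one hour, prints "ok"
```

The comparison is cross-multiplied so that it needs only integer arithmetic, which keeps the helper usable in a plain POSIX shell. A count exactly at the limit is reported as ok.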
Similarly, the line that reports recovery errors contains the recovery error count in hexadecimal. The preceding lines identify the node and port with the errors. Recovery errors are major errors, and the affected links must be investigated for the cause of the rapid symbol error propagation.
Note - The ibdiagnet.log file also contains a log of the test.