在 Solaris 10 Opteron 上,使用 hadbm 命令启动、停止或重新配置 HADB 可能会失败或挂起,并产生以下错误消息:
hadbm:Error 22009: The command issued had no progress in the last 300 seconds. HADB-E-21070: The operation did not complete within the time limit, but has not been cancelled and may complete at a later time. |
如果 clu_noman_srv 进程所使用的文件 (nomandevice) 存在不一致的读/写操作,就可能出现这种情况。通过在 HADB 历史文件中查找以下消息,可以检测到此问题:
n:3 NSUP INF 2005-02-11 18:00:33.844 p:731 Child process noman3 733 does not respond. n:3 NSUP INF 2005-02-11 18:00:33.844 p:731 Have not heard from it in 104.537454 sec. n:3 NSUP INF 2005-02-11 18:00:33.844 p:731 Child process noman3 733 did not start. |
以下解决方法未经验证,原因是此问题尚未手动再现。但是,对受影响的节点运行此命令应该能解决此问题。
hadbm restartnode --level=clear nodeno dbname |
请注意,该节点的所有设备都将重新初始化。在重新初始化之前可能必须停止该节点。