Sun Java System Access Manager 7.1 Performance Tuning and Troubleshooting Guide

Access Manager Server hangs during session failover

The problem occurs when the Access Manager server and JMQ (and possibly BDB) are installed on the same machine. Both web server instances hang. A thread dump from the first web server instance shows that all of its threads are in socketRead operations.


waiting:

- java.net.SocketInputStream.socketRead0
(java.io.FileDescriptor, byte[], int, int, int) @bci=0, 
pc=0xf75e0274, methodOop=0xf33a7aa8 
(Compiled frame; information may be imprecise)

A thread dump of the second web server instance shows the corresponding writePacketNoAck calls from jmsclient:


 
- com.sun.messaging.jmq.jmsclient.ProtocolHandler.writePacketNoAck
(com.sun.messaging.jmq.io.ReadWritePacket) @bci=7, line=235, 
pc=0xf79a56b4, methodOop=0xf3650320 (Compiled frame; 
information may be imprecise)
- com.sun.messaging.jmq.jmsclient.ProtocolHandler.writeJMSMessage
(javax.jms.Message) @bci=565, line=1567, pc=0xf76bc190, 
methodOop=0xf36533c8 (Compiled frame)
- com.sun.messaging.jmq.jmsclient.WriteChannel.sendWithFlowControl
(javax.jms.Message) @bci=10, line=123, pc=0xf7825278, 
methodOop=0xf3689e48 (Compiled frame)
- com.sun.messaging.jmq.jmsclient.TopicPublisherImpl.publish
(javax.jms.Message) @bci=2, line=73, pc=0xf782ece0, 
methodOop=0xf36b2400 (Compiled frame)
- com.iplanet.dpro.session.jmqdb.JMQSessionRepository.save
(com.iplanet.dpro.session.service.InternalSession) @bci=92, 
line=346, pc=0xf7775008, methodOop=0xf3604770 (Compiled frame)
- com.iplanet.dpro.session.service.SessionService.saveForFailover
(com.iplanet.dpro.session.service.InternalSession) @bci=26, 
line=2485, pc=0xf7005c34, methodOop=0xf35da5e8 (Interpreted frame)
- com.iplanet.dpro.session.service.InternalSession.updateForFailover() 
@bci=46, line=969, pc=0xf74a7c48, methodOop=0xf36c0420 
(Compiled frame)
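
To judge how many threads are stuck at each of these points, it can help to count the relevant frames in a saved thread dump. The following is a minimal sketch, not part of Access Manager: the class name DumpFrameCounter and the method countFrames are hypothetical, and the code assumes the dump has been saved as plain text in which each frame's method name appears on a single line, as in the excerpts above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: counts the lines of a saved thread dump that
// mention each method name of interest.
final class DumpFrameCounter {

    static Map<String, Integer> countFrames(String dumpText, List<String> methodNames) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : methodNames) {
            counts.put(name, 0);
        }
        // "\\R" matches any line terminator (Java 8 and later).
        for (String line : dumpText.split("\\R")) {
            for (String name : methodNames) {
                if (line.contains(name)) {
                    counts.merge(name, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Illustrative input only: a few frame lines in the style of the dumps above.
        String sample = String.join("\n",
            "- java.net.SocketInputStream.socketRead0",
            "(java.io.FileDescriptor, byte[], int, int, int) @bci=0,",
            "- java.net.SocketInputStream.socketRead0",
            "- com.sun.messaging.jmq.jmsclient.ProtocolHandler.writePacketNoAck");
        System.out.println(countFrames(sample, Arrays.asList("socketRead0", "writePacketNoAck")));
    }
}
```

A high socketRead0 count in one instance together with many writePacketNoAck frames in the other is consistent with the pattern described in this section, although the counts alone do not establish the cause.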

Under heavy load, the Access Manager server web container process uses most of the machine's CPU resources, leaving JMQ and/or BDB (with the Access Manager sessiondb) without enough CPU to process incoming requests. Request threads in the first Access Manager server instance then cannot write to the second instance, which has JMQ behind it, because of this CPU shortage. Threads also accumulate in the first instance because of the backlog on the second instance, where JMQ and/or BDB are not keeping up with updates to the session table.

Solution: Install JMQ and BDB on their own machines, separate from the Access Manager server machine.