26 Analyzing Reporter Content

Coherence provides out-of-the-box report content that helps administrators and developers analyze usage and configuration issues.

26.1 Network Health

The Network Health report contains the primary aggregates for determining the health of network communications. The network health file is a tab-delimited file whose name is prefixed with the date in YYYYMMDD format and suffixed with -network-health.txt. For example, 20090131-network-health.txt would be created on January 31, 2009. Table 26-1 describes the content of the Network Health report.

Table 26-1 Contents of the Network Health Report

Column Data Type Description

Batch Counter

Long

A sequential counter that helps integrate information between related files. The value resets when the reporter restarts and is not consistent across nodes, but it is still helpful when integrating files.

Report Time

Date

The system time when the report executed.

Min Node Rx Success

Double

The minimum receiver success rate for a node in the cluster. If this value is considerably less (10%) than the Grid Rx Success rate, perform further analysis using the Network Health Detail report.

Grid Rx Success

Double

The receiver success rate for the grid as a whole. If this value is below 90%, perform further analysis using the Network Health Detail report.

Min Node Tx Success

Double

The minimum publisher success rate for a node in the cluster. If this value is considerably less (10%) than the Grid Tx Success rate, perform further analysis using the Network Health Detail report.

Grid Tx Success

Double

The publisher success rate for the grid as a whole. If this value is below 90%, perform further analysis using the Network Health Detail report.
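The naming convention and the thresholds in Table 26-1 can be sketched in a few lines of Python. This is only an illustration: it assumes the file starts with a header row that uses the column names from the table, that success rates are fractions between 0 and 1, and that "10%" means an absolute difference of 0.10. The function names and the sample rows are hypothetical.

```python
import csv
import io
from datetime import date


def report_filename(day: date, suffix: str) -> str:
    """Build a report file name: YYYYMMDD prefix, then the report suffix."""
    return f"{day:%Y%m%d}-{suffix}.txt"


def read_report(text: str) -> list:
    """Parse tab-delimited report text into dicts (assumes a header row)."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def review_reasons(row: dict, grid_floor: float = 0.90, node_gap: float = 0.10) -> list:
    """Return the reasons a Network Health row warrants a look at the detail report.

    Assumes rates are fractions in [0, 1] and that "10%" means 0.10 absolute.
    """
    reasons = []
    for direction in ("Rx", "Tx"):
        grid = float(row[f"Grid {direction} Success"])
        node_min = float(row[f"Min Node {direction} Success"])
        if grid < grid_floor:
            reasons.append(f"Grid {direction} Success below {grid_floor:.0%}")
        if grid - node_min > node_gap:
            reasons.append(
                f"Min Node {direction} Success trails the grid by more than {node_gap:.0%}"
            )
    return reasons


# Hypothetical sample data; real files may format values differently.
SAMPLE = (
    "Batch Counter\tReport Time\tMin Node Rx Success\tGrid Rx Success\t"
    "Min Node Tx Success\tGrid Tx Success\n"
    "1\t2009-01-31 10:00:00\t0.97\t0.99\t0.98\t0.99\n"
    "2\t2009-01-31 10:01:00\t0.70\t0.85\t0.97\t0.99\n"
)

if __name__ == "__main__":
    print(report_filename(date(2009, 1, 31), "network-health"))
    for row in read_report(SAMPLE):
        print(row["Batch Counter"], review_reasons(row))
```

If the reporter writes rates as percentages rather than fractions, the two threshold arguments would need to be scaled to match.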


26.2 Network Health Detail

The Network Health Detail report supports the Network Health report with node-level details for determining the health of network communications. The network health detail file is a tab-delimited file whose name is prefixed with the date in YYYYMMDD format and suffixed with -network-health-detail.txt. For example, 20090131-network-health-detail.txt would be created on January 31, 2009. Table 26-2 describes the content of the Network Health Detail report.

Table 26-2 Contents of the Network Health Detail Report

Column Data Type Description

Batch Counter

Long

A sequential counter that helps integrate information between related files. The value resets when the reporter restarts and is not consistent across nodes, but it is still helpful when integrating files.

Report Time

Date

The system time when the report executed.

Node Id

Long

The node for the network statistics.

Tx Success

Double

The publisher success rate for the node. If this value is within 2%-3% of the "Min Node Tx Success" and more than 10% less than the "Grid Tx Success" for the batch in the Network Health File, the corresponding node may be having difficulty communicating with the cluster. Constrained CPU, constrained network bandwidth or high network latency could cause this to occur.

Rx Success

Double

The receiver success rate for the node. If this value is within 2%-3% of the "Min Node Rx Success" and more than 10% less than the "Grid Rx Success" for the batch in the Network Health File, the corresponding node may be having difficulty communicating with the cluster. Constrained CPU, constrained network bandwidth, or high network latency could cause this to occur.

PacketsSent

Double

The total number of network packets sent by the node.

Current Packets Sent

Long

The number of packets sent by the node since the prior execution of the report.

PacketsResent

Long

The total number of network packets resent by the node. Packets are resent when the receiver of the packet receives an invalid packet or when an acknowledgment packet is not sent within the appropriate amount of time.

Current Packet Resent

Long

The number of network packets resent by the node since the prior execution of the report.

PacketsRepeated

Long

The total number of packets received more than once.

Current Packets Repeated

Long

The number of packets received more than once since the last execution of the report.

PacketsReceived

Long

The total number of packets received by the node.

Current Packets Received

Long

The number of packets received by the node since the last execution of the report.
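The Tx Success and Rx Success descriptions in Table 26-2 combine values from two files that share a Batch Counter. The following Python sketch shows one way to apply that rule once both files have been parsed into dictionaries keyed by column name. It assumes rates are fractions between 0 and 1, treats "within 2%-3%" as an absolute difference of at most 0.03 and "more than 10% less" as an absolute difference above 0.10, and uses hypothetical function and variable names.

```python
def suspect_nodes(health_rows, detail_rows, near_min=0.03, grid_gap=0.10):
    """Flag (batch, node, direction) entries that match the rule in Table 26-2.

    A node is flagged when its rate is within `near_min` of the batch minimum
    and more than `grid_gap` below the grid rate. Rates are assumed to be
    fractions in [0, 1]; both thresholds are treated as absolute differences.
    """
    health_by_batch = {row["Batch Counter"]: row for row in health_rows}
    suspects = []
    for detail in detail_rows:
        health = health_by_batch.get(detail["Batch Counter"])
        if health is None:
            continue  # no matching Network Health row for this batch
        for direction in ("Tx", "Rx"):
            rate = float(detail[f"{direction} Success"])
            node_min = float(health[f"Min Node {direction} Success"])
            grid = float(health[f"Grid {direction} Success"])
            if rate - node_min <= near_min and grid - rate > grid_gap:
                suspects.append((detail["Batch Counter"], detail["Node Id"], direction))
    return suspects


# Hypothetical rows, as they might look after parsing both files.
HEALTH = [
    {
        "Batch Counter": "7",
        "Min Node Tx Success": "0.80",
        "Grid Tx Success": "0.97",
        "Min Node Rx Success": "0.96",
        "Grid Rx Success": "0.98",
    }
]
DETAIL = [
    {"Batch Counter": "7", "Node Id": "1", "Tx Success": "0.99", "Rx Success": "0.99"},
    {"Batch Counter": "7", "Node Id": "2", "Tx Success": "0.80", "Rx Success": "0.96"},
]

if __name__ == "__main__":
    print(suspect_nodes(HEALTH, DETAIL))
```

Because the Batch Counter resets when the reporter restarts, a join like this is only meaningful for files written by the same reporter run.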


26.3 Memory Status

The Memory Status report must be run as part of a report batch. The values are helpful in understanding memory consumption on each node and across the grid. For data to be included, nodes must be configured to publish platform MBean information. The memory status file is a tab-delimited file whose name is prefixed with the date in YYYYMMDD format and suffixed with -memory-status.txt. For example, 20090131-memory-status.txt would be created on January 31, 2009. Table 26-3 describes the content of the Memory Status report.

Table 26-3 Contents of the Memory Status Report

Column Data Type Description

Batch Counter

Long

A sequential counter that helps integrate information between related files. The value resets when the reporter restarts and is not consistent across nodes, but it is still helpful when integrating files.

Report Time

Date

The system time when the report executed.

Node Id

Long

The node for the memory statistics.

Gc Name

String

The name of the garbage collector.

CollectionCount

Long

The number of garbage collections that have happened since the virtual machine started.

Delta Collection Count

Long

The number of garbage collections that have occurred since the last execution of the report.

CollectTime

Long

The number of milliseconds the JVM has spent on garbage collection since the start of the JVM.

Delta Collect Time

Long

The number of milliseconds the JVM has spent on garbage collection since the last execution of the report.

Last GC Start Time

Long

The start time of the last garbage collection.

Last GC Stop Time

Long

The stop time of the last garbage collection.

Heap Committed

Long

The number of heap bytes committed at the time of report.

Heap Init

Long

The number of heap bytes initialized at the time of the report.

Heap Max

Long

The maximum number of bytes used by the JVM since the start of the JVM.

Heap Used

Long

The bytes used by the JVM at the time of the report.
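As a rough illustration of how the columns in Table 26-3 can be combined, the Python sketch below derives two figures from a single parsed row: the share of committed heap in use, and the average garbage collection time over the last report interval. Both derived figures and the function name are illustrative choices, not values the reporter itself produces, and the sample row is hypothetical.

```python
def memory_summary(row: dict) -> dict:
    """Derive simple indicators from one Memory Status row.

    heap_used_ratio: Heap Used divided by Heap Committed.
    avg_gc_ms: Delta Collect Time divided by Delta Collection Count,
    or 0.0 when no collections ran during the interval.
    """
    used = int(row["Heap Used"])
    committed = int(row["Heap Committed"])
    gc_count = int(row["Delta Collection Count"])
    gc_millis = int(row["Delta Collect Time"])
    return {
        "node": row["Node Id"],
        "gc": row["Gc Name"],
        "heap_used_ratio": used / committed if committed else None,
        "avg_gc_ms": gc_millis / gc_count if gc_count else 0.0,
    }


# Hypothetical parsed row; the collector name is only an example value.
ROW = {
    "Node Id": "3",
    "Gc Name": "ExampleCollector",
    "Heap Used": "512",
    "Heap Committed": "1024",
    "Delta Collection Count": "4",
    "Delta Collect Time": "120",
}

if __name__ == "__main__":
    print(memory_summary(ROW))
```

A node whose average collection time grows from batch to batch is a natural candidate for closer inspection alongside its Refresh Time in the Node List report.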


26.4 Cache Size

The Cache Size report can be executed on demand or added as part of the report batch. The caches should have the <unit-calculator> subelement of <local-scheme> set to BINARY. The cache size file is a tab-delimited file whose name is prefixed with the date in YYYYMMDD format and suffixed with -cache-size.txt. For example, 20090131-cache-size.txt would be created on January 31, 2009. Table 26-4 describes the content of the Cache Size report.

Table 26-4 Contents of the Cache Size Report

Column Data Type Description

Batch Counter

Long

A sequential counter that helps integrate information between related files. The value resets when the reporter restarts and is not consistent across nodes, but it is still helpful when integrating files.

Report Time

Date

The system time when the report executed.

Cache Name

String

The name of the cache.

MemoryMB

Double

The number of megabytes consumed by the objects in the cache. This does not include indexes or overhead.

Avg Object Size

Double

The average memory consumed by each object.

Cache Size

Double

The number of objects in the cache.

Memory Bytes

Double

The number of bytes consumed by the objects in the cache. This does not include indexes or overhead.
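One common use of the Cache Size report is to see which caches account for most of the memory in a batch. The Python sketch below ranks the rows of a single batch by Memory Bytes and reports each cache's share of the total. The function names and sample rows are hypothetical, and the per-object figure is computed here from Memory Bytes and Cache Size rather than read from the Avg Object Size column.

```python
def cache_memory_shares(rows: list) -> list:
    """Rank caches by Memory Bytes and return (name, bytes, share of total)."""
    total = sum(float(row["Memory Bytes"]) for row in rows)
    ranked = sorted(rows, key=lambda row: float(row["Memory Bytes"]), reverse=True)
    return [
        (
            row["Cache Name"],
            float(row["Memory Bytes"]),
            float(row["Memory Bytes"]) / total if total else 0.0,
        )
        for row in ranked
    ]


def bytes_per_object(row: dict) -> float:
    """Memory Bytes divided by Cache Size, or 0.0 for an empty cache."""
    count = float(row["Cache Size"])
    return float(row["Memory Bytes"]) / count if count else 0.0


# Hypothetical parsed rows from one batch.
ROWS = [
    {"Cache Name": "customers", "Cache Size": "4", "Memory Bytes": "100"},
    {"Cache Name": "orders", "Cache Size": "3", "Memory Bytes": "300"},
]

if __name__ == "__main__":
    for name, size, share in cache_memory_shares(ROWS):
        print(name, size, f"{share:.0%}")
```

Since the reported sizes exclude indexes and overhead, the shares describe object data only and understate total heap consumption.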


26.5 Service Report

The Service report provides information about requests processed, request failures, and request backlog, as well as tasks processed, task failures, and task backlog. Request Count and Task Count are useful for determining the performance and throughput of the service. RequestPendingCount and Task Backlog are useful for identifying capacity issues or blocked processes. Task Hung Count, Task Timeout Count, Thread Abandoned Count, and Request Timeout Count record the number of unsuccessful executions that have occurred in the system. Table 26-5 describes the contents of the Service report.

Table 26-5 Contents of the Service Report

Column Data Type Description

Batch Counter

Long

A sequential counter that helps integrate information between related files. The value resets when the reporter restarts and is not consistent across nodes, but it is still helpful when integrating files.

Report Time

Date

The system time when the report executed.

Service

String

The service name.

Node Id

String

The numeric node identifier.

Refresh Time

Date

The system time when the service information was updated from a remote node.

Request Count

Long

The number of requests since the last report execution.

RequestPendingCount

Long

The number of pending requests at the time of the report.

RequestPendingDuration

Long

The duration for the pending requests at the time of the report.

Request Timeout Count

Long

The number of request timeouts since the last report execution.

Task Count

Long

The number of tasks executed since the last report execution.

Task Backlog

Long

The task backlog at the time of the report execution.

Task Timeout Count

Long

The number of task timeouts since the last report execution.

Task Hung Count

Long

The number of tasks that hung since the last report execution.

Thread Abandoned Count

Long

The number of threads abandoned since the last report execution.
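The grouping of Service report columns into throughput, capacity, and failure indicators can be sketched as a small check over one parsed row. The Python below is illustrative only: the function name, the sample row, and the backlog limit of 100 are hypothetical, and a sensible limit depends on the workload.

```python
# Columns that count unsuccessful executions, per the section introduction.
FAILURE_COLUMNS = (
    "Task Hung Count",
    "Task Timeout Count",
    "Thread Abandoned Count",
    "Request Timeout Count",
)
# Columns that point to capacity issues or blocked processes.
CAPACITY_COLUMNS = ("RequestPendingCount", "Task Backlog")


def service_issues(row: dict, backlog_limit: int = 100) -> tuple:
    """Return (failures, capacity) dicts holding only the columns that look unhealthy.

    A failure column is reported when it is above zero; a capacity column is
    reported when it exceeds `backlog_limit`, which is an arbitrary example value.
    """
    failures = {name: int(row[name]) for name in FAILURE_COLUMNS if int(row[name]) > 0}
    capacity = {
        name: int(row[name]) for name in CAPACITY_COLUMNS if int(row[name]) > backlog_limit
    }
    return failures, capacity


# Hypothetical parsed row; the service name is only an example value.
ROW = {
    "Service": "ExampleService",
    "Node Id": "2",
    "RequestPendingCount": "250",
    "Task Backlog": "5",
    "Task Hung Count": "0",
    "Task Timeout Count": "3",
    "Thread Abandoned Count": "0",
    "Request Timeout Count": "0",
}

if __name__ == "__main__":
    print(service_issues(ROW))
```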


26.6 Node List

Because the node identifier (nodeId) is transient, the reporter writes out a list of nodes together with the user-defined <member-identity> information. The node list file is a tab-delimited file whose name is prefixed with the date in YYYYMMDD format and suffixed with -nodes.txt. For example, 20090131-nodes.txt would be created on January 31, 2009. Table 26-6 describes the content of the Node List report.

Table 26-6 Contents of the Node List Report

Column Data Type Description

Batch Counter

Long

A sequential counter that helps integrate information between related files. The value resets when the reporter restarts and is not consistent across nodes, but it is still helpful when integrating files.

Report Time

Date

The system time when the report executed.

Node Id

String

The numeric node identifier.

UnicastAddress

String

The unicast address for the node.

MemberName

String

The member name for the node.

ProcessName

String

The process name for the node.

RoleName

String

The role name for the node.

MachineName

String

The machine name for the node.

RackName

String

The rack name for the node.

SiteName

String

The site name for the node.

Refresh Time

Date/Time

The time at which the information was refreshed from a remote node. If the time is not the same as the refresh time on other rows in the batch, the node did not respond in a timely manner. This is often caused by a node performing a garbage collection. Any information regarding a node with an "old" refresh date is questionable.
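The Refresh Time comparison described above can be automated. The following Python sketch groups parsed Node List rows by Batch Counter, takes the most common Refresh Time in each batch as the reference, and lists the nodes whose value differs. Comparing the values as plain strings and using the most common value as the reference are both simplifying assumptions, and the function name and sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def stale_nodes(rows: list) -> list:
    """Return (batch, node id) pairs whose Refresh Time differs from the batch norm.

    The reference for each batch is its most common Refresh Time string.
    """
    by_batch = defaultdict(list)
    for row in rows:
        by_batch[row["Batch Counter"]].append(row)

    stale = []
    for batch, members in by_batch.items():
        common_time, _ = Counter(m["Refresh Time"] for m in members).most_common(1)[0]
        stale.extend(
            (batch, m["Node Id"]) for m in members if m["Refresh Time"] != common_time
        )
    return stale


# Hypothetical parsed rows; the timestamp format is only an example.
ROWS = [
    {"Batch Counter": "12", "Node Id": "1", "Refresh Time": "2009-01-31 10:05:00"},
    {"Batch Counter": "12", "Node Id": "2", "Refresh Time": "2009-01-31 10:05:00"},
    {"Batch Counter": "12", "Node Id": "3", "Refresh Time": "2009-01-31 10:03:00"},
]

if __name__ == "__main__":
    print(stale_nodes(ROWS))
```

Rows from other reports for the flagged nodes in the same batch should be read with the same caution.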


26.7 Proxy Report

The Proxy report provides information about proxy servers and the information being transferred to clients. The proxy file is a tab-delimited file whose name is prefixed with the date in YYYYMMDD format and suffixed with -report-proxy.txt. For example, 20090131-report-proxy.txt would be created on January 31, 2009. Table 26-7 describes the content of the Proxy report.

Table 26-7 Contents of the Proxy Report

Column Data Type Description

Batch Counter

Long

A sequential counter that helps integrate information between related files. The value resets when the reporter restarts and is not consistent across nodes, but it is still helpful when integrating files.

Report Time

Date

The system time when the report executed.

Node Id

String

The numeric node identifier.

Service Name

String

The name of the proxy service.

HostIp

String

The IP address and port of the proxy service.

ConnectionCount

Long

The current number of connections to the proxy service.

OutgoingByteBacklog

Long

The number of bytes queued to be sent by the proxy service.

OutgoingMessageBacklog

Long

The number of messages queued by the proxy service.

Bytes Sent

Long

The number of bytes sent by the proxy service since the last execution of the report.

Bytes Received

Long

The number of bytes received by the proxy service since the last execution of the report.

Messages Sent

Long

The number of messages sent by the proxy service since the last execution of the report.

Messages Received

Long

The number of messages received by the proxy service since the last execution of the report.
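To show how the Proxy report columns might be read together, the Python sketch below summarizes one parsed row: it derives the average size of a sent message for the interval and notes whether any outgoing backlog exists. The derived figure, the function name, and the sample row are illustrative assumptions rather than part of the report itself.

```python
def proxy_summary(row: dict, backlog_limit: int = 0) -> dict:
    """Summarize one Proxy report row.

    avg_bytes_per_message_sent: Bytes Sent divided by Messages Sent for the
    interval, or 0.0 when nothing was sent.
    has_backlog: True when either outgoing backlog exceeds `backlog_limit`.
    """
    messages_sent = int(row["Messages Sent"])
    bytes_sent = int(row["Bytes Sent"])
    return {
        "node": row["Node Id"],
        "service": row["Service Name"],
        "connections": int(row["ConnectionCount"]),
        "avg_bytes_per_message_sent": bytes_sent / messages_sent if messages_sent else 0.0,
        "has_backlog": (
            int(row["OutgoingMessageBacklog"]) > backlog_limit
            or int(row["OutgoingByteBacklog"]) > backlog_limit
        ),
    }


# Hypothetical parsed row; the service name is only an example value.
ROW = {
    "Node Id": "5",
    "Service Name": "ExampleProxyService",
    "ConnectionCount": "12",
    "OutgoingByteBacklog": "0",
    "OutgoingMessageBacklog": "4",
    "Bytes Sent": "2048",
    "Messages Sent": "16",
}

if __name__ == "__main__":
    print(proxy_summary(ROW))
```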