High load scenarios are recognizable by the following symptoms:
User requests do not succeed
The database gives multiple timeout and “transaction aborted” messages
Frequent “HIGH LOAD” warnings in the history file
Sporadic failures
If a high load problem is suspected, consider the following:
Frequently, these problems can be resolved by making more CPU capacity available.
All user operations (delete, insert, update) are logged in the tuple log and executed. The tuple log may fill up because:
Execution slows due to CPU or disk I/O contention
The mirror node is slow in receiving the log records, which can happen as a result of:
Network contention, so the log records do not reach the mirror node
CPU and disk contention at the mirror node, which keeps it from processing the received log records quickly enough (indicated by “log throw due to...” messages in the history files).
If the tuple log is out of space, the history files contain messages showing HIGH LOAD on the tuple log.
Check CPU usage, as described in Improving CPU Utilization.
If CPU utilization is not the problem, check the disk I/O. If the disk shows contention, increase the data buffer size with hadbm set DataBufferPoolSize=... to avoid page faults while log records are being processed, and follow the solutions suggested in Is There Disk Contention?
Look for evidence of network contention, and resolve bottlenecks.
Increase the tuple log buffer using hadbm set LogBufferSize=...
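The two tuple-log remedies above can be applied with hadbm set. The values below are placeholders for illustration only, not recommendations; check the valid range and units for these attributes in your HADB version before applying them, and note that changing attributes may restart database nodes:

```shell
# Enlarge the data buffer to reduce page faults while log records
# are being processed. (Example value only.)
hadbm set DataBufferPoolSize=512

# Enlarge the tuple log buffer so log records are less likely to
# back up under load. (Example value only.)
hadbm set LogBufferSize=64
```

Re-check the history files afterward: if “HIGH LOAD” messages for the tuple log persist, the bottleneck is more likely CPU, disk, or network contention than buffer size.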
Too many node-internal operations are scheduled but not processed due to CPU or disk I/O problems.
If the node-internal log is out of space, the history files contain messages showing HIGH LOAD on the node internal log.
Check CPU usage, as described in Improving CPU Utilization.
If CPU utilization is not the problem, check the disk I/O. If the disk shows contention, increase the data buffer size with hadbm set DataBufferPoolSize=... to avoid page faults while log records are being processed, and follow the solutions suggested in Is There Disk Contention?
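For node-internal log pressure specifically, the corresponding buffer attribute is InternalLogBufferSize (listed in Table 3–1). A sketch of the adjustment, with a placeholder value rather than a recommendation:

```shell
# Enlarge the node-internal log buffer so scheduled node-internal
# operations are less likely to overflow it. (Example value only;
# verify units and limits for your HADB version.)
hadbm set InternalLogBufferSize=48
```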
Additional symptoms that identify this condition are:
Error code 2080 or 2096 delivered to the client.
hadbm resourceinfo --locks shows that all allocated locks are in use at all times
A transaction running on a node is not allowed to use more than 25% of the locks allocated on that node. Read transactions running at the “repeatable read” isolation level, as well as update/insert/delete transactions, hold their locks until the transaction terminates. Therefore, it is recommended to split long transactions into smaller batches of separate transactions.
Use hadbm set NumberOfLocks= to increase the number of locks.
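Putting the diagnosis and the remedy together, a session might look like the following. The lock count shown is a placeholder, not a recommendation:

```shell
# Inspect lock allocation and usage on each node; locks that are
# allocated and continuously all in use indicate lock exhaustion.
hadbm resourceinfo --locks

# Raise the lock allocation. (Example value only. Remember that a
# single transaction may use at most 25% of a node's locks, so very
# long transactions should also be split into smaller ones.)
hadbm set NumberOfLocks=100000
```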
In most situations, reducing load or increasing the availability of resources will improve host performance. Some of the more common steps to take are:
Run the nodes on hosts with better hardware characteristics (more internal memory, higher processor speed, more processors).
Add physical disks and use several data devices, placing no more than one device on each physical disk.
Add more nodes, on new hosts, and refragment the data to utilize the new nodes.
Change configuration variables to allocate larger memory segments or internal data structures.
In addition, the following resources can be adjusted to improve “HIGH LOAD” problems, as described in the Performance and Tuning Guide:
Table 3–1 HADB Performance Tuning Properties

Resource | Property
---|---
Size of Database Buffer | hadbm attribute DataBufferPoolSize
Size of Tuple Log Buffer | hadbm attribute LogBufferSize
Size of Node Internal Log Buffer | hadbm attribute InternalLogBufferSize
Number of Database Locks | hadbm attribute NumberOfLocks
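Before changing any of the attributes in Table 3–1, it is worth recording their current values. Assuming your HADB version supports the hadbm get command with a comma-separated attribute list (check your hadbm reference), a quick check might look like:

```shell
# Show the current values of the tuning attributes from Table 3-1
# so changes can be compared against a known baseline.
hadbm get DataBufferPoolSize,LogBufferSize,InternalLogBufferSize,NumberOfLocks
```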