Sun Java System Application Server Standard and Enterprise Edition 7 2004Q2 Administration Guide
Chapter 21
Administering the High-Availability Database (Enterprise Edition)
This chapter describes the high-availability database (HADB) in the Sun Java System Application Server Enterprise Edition 7 environment and explains how to configure and administer the HADB. The tasks described in this chapter fit into the overall steps to set up the HADB, which are summarized in the following table.
Table 21-1 HADB Roadmap

| Step | Description of Task | Location of Instructions |
|------|---------------------|--------------------------|
| 1 | Decide on your high-availability topology and set up your systems | Sun Java System Application Server System Deployment Guide |
| 2 | Install the HADB software with or without the Sun Java System Application Server software | Sun Java System Application Server Installation Guide |
| 3 | Set up the following for HADB machines: | Sun Java System Application Server Installation Guide |
| 4 | Create and start the HADB*; administer the HADB | Current chapter |
| 5 | Set up session persistence and the session persistence store* | Chapter 20, "Configuring Session Persistence (Enterprise Edition)" |
| 6 | Tune the HADB for maximum performance | Sun Java System Application Server Performance Tuning Guide |

* Can be done as part of cluster setup using the clsetup command. See the Sun Java System Application Server Installation Guide.
Note
If you have not configured rsh or ssh, the hadbm commands described in this chapter will not work. For details about rsh or ssh configuration, see the Sun Java System Application Server Installation Guide and Setting the Management Protocol.
This chapter contains the following sections:
About the High-Availability Database
This section introduces you to the high-availability database (HADB), which you can use to store persistent HTTP session and stateful session bean (SFSB) session information. This section includes these topics:
HADB Architecture
High-availability means availability despite planned outages for upgrades or unplanned outages caused by hardware or software failures. The HADB is based on a simple data model and the Always-On technology. The HADB offers an ideal platform for delivering all types of session state persistence within a high performance enterprise application server environment.
The following figure shows the architecture of a database with four active nodes and two spare nodes. Nodes 0 and 1 are a mirror node pair, as are nodes 2 and 3.
Figure 21-1 HADB Architecture
The HADB achieves high data availability through fragmentation and replication of data. All tables in the database are partitioned to create subsets of approximately the same size called fragments. This process of fragmentation is based on a hash function. This hash function fragments and evenly distributes the data among the database’s nodes. Each fragment is stored twice in the database, in mirror nodes. This ensures fault tolerance and fast recovery of data. In addition, if a node fails or is shut down, a spare node can take over until the node is active again.
HADB nodes are organized into two Data Redundancy Units (DRUs), which mirror each other. Each DRU consists of half of the active and spare nodes, and contains one complete copy of the data. To ensure fault tolerance, the computers that support one DRU must be completely self-supported with respect to power (use of uninterruptible power supplies is recommended), processing units, and storage. If a power failure occurs in one DRU, the nodes in the other DRU can continue servicing requests until the power returns.
Without a session persistence mechanism, the HTTP or SFSB session state, including the passivated session state, is lost when one web or EJB container fails over to another. Use of the HADB for session persistence overcomes this situation. The HADB stores and retrieves state information in a separate but well-integrated persistent storage tier.
The HADB reclaims space when session data is deleted. The HADB places session data records in fixed size blocks. When all records of a block are deleted, the block is freed. Records of a block can be deleted randomly, creating “holes” in the block. When a new record is to be inserted into a block and contiguous space is needed, the holes are removed and thus the block is compacted.
This is a brief summary of the architecture. For details, see the Sun Java System Application Server System Deployment Guide.
HADB Nodes
A database node consists of a set of processes, a dedicated area of shared memory, and one or more secondary storage devices. It is used for storing and updating session data. Each node must have a mirror node, so nodes occur in pairs. In addition, to maximize availability, you should include two or more spare nodes, one in each DRU, so that if a node fails, a spare can take over while the node is repaired.
For an explanation of node topology alternatives, see the Sun Java System Application Server System Deployment Guide. For more general information about nodes and how to monitor them, see Node Status.
The hadbm Command
Note
If you have not configured rsh or ssh, the hadbm command will not work. For details about rsh or ssh configuration, see the Sun Java System Application Server Installation Guide and Setting the Management Protocol.
The hadbm command is the management client for the HADB database. By default, there is one management client located in the install_dir/SUNWhadb/4/bin directory. For information about creating additional management clients, see Creating Other Management Clients.
The commands you use to administer HADB are subcommands of the hadbm command. The general syntax is as follows:
hadbm subcommand [-short-form [argument]]* [--long-form [argument]]* [dbname]
For example, the following is one use of the hadbm status subcommand, which lets you check the status of HADB:
hadbm status --nodes
The subcommand identifies the operation or task you wish to perform. Subcommands are case-sensitive.
Options are also case-sensitive. Each option has a long form and a short form. Short forms have a single dash (-), while long forms have two dashes (--). Options modify how hadbm performs a subcommand. Most options require argument values, except for boolean options, which must be present to switch a feature on. Optional items are enclosed in square brackets [ ] in syntax lines.
For subcommands that take a database name, if a database is not identified, the default database is used. The default database is hadb, all lowercase.
You can set the password for a subcommand from a file instead of entering the password at the command line. The --dbpasswordfile option takes the file containing the passwords. The valid contents for the file are:
HADBM_DBPASSWORD=password
The rest of the contents of the file are ignored. If both the --dbpassword and --dbpasswordfile options are specified, the --dbpassword takes precedence. If a password is required, but is not specified in the command, you are prompted for a password.
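
As an illustration of the password file format described above, the following sketch writes such a file from a POSIX shell. The helper name write_hadb_passwordfile and the file name used in the comment are hypothetical, and restricting the file to its owner is a suggested precaution rather than a documented HADB requirement.

```shell
# Hypothetical helper: write a password file in the HADBM_DBPASSWORD=... format.
# Usage: write_hadb_passwordfile FILE PASSWORD
write_hadb_passwordfile() {
    pwfile=$1
    password=$2
    # Create the file readable by its owner only (a precaution, not an HADB rule).
    # The umask only affects files that do not exist yet.
    ( umask 077 && printf 'HADBM_DBPASSWORD=%s\n' "$password" > "$pwfile" )
}

# Example (hypothetical file name):
#   write_hadb_passwordfile ./hadb.pw 'secret123'
# The file can then be supplied with --dbpasswordfile=./hadb.pw
```

Keeping the password in a file avoids exposing it in the process list or shell history, which is the usual reason to prefer --dbpasswordfile over --dbpassword.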
The hadbm general command options, which you can use with any hadbm subcommand, are listed in the following table. All are booleans that are not present by default.
Creating Other Management Clients
By default, the hadbm command must be executed from the machine from which hadbm create was executed, or from an HADB machine, by the user who executed hadbm create (the HADB system user). This limitation is undesirable in some scenarios: for example, you may want to give more than one user access to the database, or the machine from which hadbm create was executed may be unavailable because of a hardware failure.
When the HADB system user executes hadbm create, the name of the database and the location of its configuration files are recorded by the management client in a file named .cladmrc in the HADB system user’s home directory.
To identify a database, all hadbm commands except create, list, help, and version search for the database configuration files in these locations, in the following order:
If the configuration files cannot be located, hadbm cannot execute properly. The hadbm list command does not list the database, and you receive the following error:
hadbm:Error 22002: Specified database does not exist. Use "hadbm list" to get a list of existing databases: [ hadb ]
You can set up an hadbm management client on a non-HADB machine other than the machine from which hadbm create was executed by ensuring that the new management client can find the database configuration files.
To set up a new hadbm management client:
- Make sure that the HADB software is installed on the new management client machine. See the Sun Java System Application Server Installation Guide.
- Copy the .cladmrc file from the HADB system user’s home directory to the home directory of the user of the new management client machine.
- Check whether the configuration files are accessible from the new management client machine. Entries in the .cladmrc file have the following format:
dbname:configpath:howtoaccess
The howtoaccess value can be NFSMNT for the network file system, or a host name for a path that can only be accessed locally. For example:
hadb:/etc/opt/SUNWhadb/dbdef:NFSMNT
hadb:/dsk0/dbdef:host0
If the howtoaccess value of the database is NFSMNT and the configpath/dbname/cfg file can be read from the new management client machine, the next step is unnecessary.
- If the configuration files are not accessible from the new management client machine:
  - Create a directory on the new management client machine with the same path as the configpath. For example, this new directory could be /etc/opt/SUNWhadb/dbdef.
  - Create a directory called dbname (for example hadb) under configpath. Copy into this new directory all the files from configpath/dbname on an HADB machine or the machine from which hadbm create was executed.
  - Change the howtoaccess value of the database’s entry in the .cladmrc file to the host name of the new management client machine.
- Make sure the new management client machine can communicate with the HADB machines over the network through the ManagementProtocol (ssh or rsh) that was specified in the hadbm create command.
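
The accessibility check in the steps above depends on reading the dbname:configpath:howtoaccess entry for a database. A minimal sketch of that lookup follows, assuming the .cladmrc file contains one entry per line in exactly that format. The helper name cladmrc_lookup is hypothetical and is not part of the HADB software.

```shell
# Hypothetical helper: print the .cladmrc fields for one database.
# Entry format (from the text above): dbname:configpath:howtoaccess
# Usage: cladmrc_lookup FILE DBNAME   -> prints "configpath howtoaccess"
cladmrc_lookup() {
    rcfile=$1
    wanted=$2
    while IFS=: read -r dbname configpath howtoaccess; do
        if [ "$dbname" = "$wanted" ]; then
            printf '%s %s\n' "$configpath" "$howtoaccess"
            return 0
        fi
    done < "$rcfile"
    return 1   # no entry for this database
}
```

With the configpath in hand, a test such as [ -r configpath/dbname/cfg ] on the new management client machine indicates whether the configuration files need to be copied locally.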
Configuring the HADB
This section describes the following basic HADB configuration tasks:
Setting the Management Protocol
Before you create the database, you need to determine whether the hadbm command uses rsh or ssh for remote execution. Because ssh is the default, you only need to set the management protocol if you need to use rsh instead. Use the --set option of the hadbm create command, as in the following example:
hadbm create --set ManagementProtocol=rsh --spares 2 --devicesize 1024 --dbpassword secret123 --hosts n0,n1,n2,n3,n4,n5
The hadbm command uses the rsh or ssh protocol to communicate with the database nodes. The management protocol is not used by HADB nodes for communication.
Creating a Database
The clsetup command creates an HADB database as part of cluster initialization and setup. This is the recommended way of creating a database. For details about clsetup, see the Sun Java System Application Server Installation Guide.
However, you can create a database outside of clsetup. To manually create a database to store the session data, use the hadbm create command. The syntax is as follows.
hadbm create [--installpath=path] [--historypath=path] [--devicepath=path] [--configpath=path] [--datadevices=devices-per-node] [--portbase=base-no] [--spares=sparecount] [--set=attr-name-value-list] [--inetd] [--inetdsetupdir=path] --devicesize=size --dbpassword=password | --dbpasswordfile=file --hosts=node-list [dbname]
For example:
hadbm create --spares 2 --devicesize 1024 --dbpassword secret123 --hosts n0,n1,n2,n3,n4,n5
The hadbm create command also starts the database if the --inetd and --inetdsetupdir options are not used.
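
Several of the option rules for hadbm create, listed in Table 21-3 in this section, can be checked before the command is run: the number of nodes in --hosts must be even, --spares must be even and less than the number of nodes, and the password must be at least 8 characters. The following sketch checks only these three rules. The function name check_create_args is hypothetical, and hadbm itself remains the authority on what it accepts.

```shell
# Hypothetical pre-check for hadbm create arguments, based on the option rules
# in Table 21-3. Prints a message and returns 1 on the first violated rule.
# Usage: check_create_args HOSTS SPARES PASSWORD
check_create_args() {
    hosts=$1
    spares=$2
    password=$3

    # One node is created for each comma-separated item in the --hosts list.
    nodes=$(printf '%s\n' "$hosts" | awk -F, '{ print NF }')

    if [ $((nodes % 2)) -ne 0 ]; then
        echo "number of nodes ($nodes) must be even" >&2
        return 1
    fi
    if [ $((spares % 2)) -ne 0 ]; then
        echo "number of spares ($spares) must be even" >&2
        return 1
    fi
    if [ "$spares" -ge "$nodes" ]; then
        echo "spares ($spares) must be less than nodes ($nodes)" >&2
        return 1
    fi
    if [ "${#password}" -lt 8 ]; then
        echo "password must be at least 8 characters" >&2
        return 1
    fi
    return 0
}
```

For the example above, check_create_args n0,n1,n2,n3,n4,n5 2 secret123 succeeds: six nodes, two spares, and a nine-character password.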
If you have difficulty creating a database, check the following:
- Sun Java System Application Server and HADB port assignments must not conflict with other port assignments on the same machine. Default and recommended port assignments are as follows:
  - Sun Java System Message Queue: 7676
  - IIOP: 3700
  - HTTP server (UNIX root or Windows): 80
  - HTTP server (UNIX non-root): 1024
  - Admin server (UNIX root or Windows): 4848
  - HADB nodes: Each node uses six consecutive ports. If the default portbase (15200) is used, node 0 uses 15200 through 15205, node 1 uses 15220 through 15225, and so on.
- Disk space must be adequate; see the Sun Java System Application Server Installation Guide.
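
When checking for port conflicts, it can help to list the ports each HADB node will use. The sketch below applies the numbering given above: each node's port base is 20 higher than the previous node's, and a node uses its port base plus the next five ports. The function name node_port_range is hypothetical.

```shell
# Hypothetical helper: print the first and last port used by a node.
# Usage: node_port_range NODENO [PORTBASE]   (PORTBASE defaults to 15200)
node_port_range() {
    nodeno=$1
    portbase=${2:-15200}
    first=$((portbase + 20 * nodeno))   # port bases advance in steps of 20
    last=$((first + 5))                 # port base plus the next five ports
    printf '%s-%s\n' "$first" "$last"
}

# List the ranges for a six-node database using the default port base.
for n in 0 1 2 3 4 5; do
    printf 'node %s: %s\n' "$n" "$(node_port_range "$n")"
done
```

The resulting ranges can be compared against the other port assignments on each machine before running hadbm create.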
Database creation errors are written to the following files:
- The Sun Java System Application Server log file. See Chapter 5, "Using Logging."
- The HADB history file. See Clearing and Archiving History Files.
- The operating system’s syslog files if the SysLogging configuration attribute is set to TRUE, the default. See Configuration Attributes.
If you still have difficulty creating a database, contact Sun customer support. See Using Sun Customer Support for the HADB.
The hadbm create command options are listed in the following table.
Table 21-3 hadbm create Options

| Long Form | Short Form | Default | Description |
|-----------|------------|---------|-------------|
| --installpath | -n | parent of the directory where hadbm resides: install_dir/SUNWhadb/4/ | Specifies the HADB system installation path. This path must already exist and be writable. Use this option if the HADB server installation resides in a location different from the management-client machine from which the hadbm create command is run. |
| --historypath | -t | /var/tmp | Specifies the path to the history files. This path must already exist and be writable. For details about history files, see Clearing and Archiving History Files. When database creation fails, the history files are removed from the HADB machines, and valuable debugging information is lost. However, if you create a writable directory with the same path as the --historypath on the machine from which you execute the hadbm create command, and this machine is different from the HADB machines, the history files are saved there. |
| --devicepath | -d | /var/opt/SUNWhadb | Specifies the path to the devices. There are three devices: the DataDevice, the NiLogDevice (node internal log device), and the RelalgDevice (relational algebra query device). This path must already exist and be writable. To set this path differently for each node or each device, see Setting Heterogeneous Device Paths. |
| --configpath | -c | /etc/opt/SUNWhadb | Specifies the path to the configuration files used internally by the HADB. This path must already exist and be writable. |
| --datadevices | -a | 1 | Specifies the number of data devices on each node, between 1 and 8 inclusive. Data devices are numbered starting at 0. |
| --portbase | -b | 15200 | Specifies the port base number used for node 0. Successive nodes are automatically assigned port base numbers in steps of 20 from this number. Each node uses its port base number and the next five consecutively numbered ports. If you want to run several databases on the same machine, you should have a plan for allocating port numbers and allocate them explicitly. |
| --spares | -s | 0 | Specifies the number of spare nodes. This number must be even and must be less than the number of nodes specified in the --hosts option. Spare nodes are optional, but having two or more ensures high availability. |
| --set | -S | none | Specifies a comma-separated list of database configuration attributes in name=value format. For explanations of valid database configuration attributes, see Viewing and Modifying Configuration Attributes. For example, to specify the use of rsh instead of ssh (the default), use the following option: --set ManagementProtocol=rsh. To use --set to set the --devicepath differently for each node or each device, see Setting Heterogeneous Device Paths. |
| --inetd | -I | not specified | If specified, the database is configured to run with the inetd daemon, and is not automatically started after it is created. See Additional Steps for inetd. |
| --inetdsetupdir | -u | current directory | Specifies the directory in which to store the inetd setup files. The directory must exist and be writable. |
| --devicesize | -z | none | Specifies the size of each device in MB. The device size should be as large as possible. The recommended size is four times the expected size of the user data, based on the number of users and the size of each user record. The maximum size is the maximum operating system file size or 256 GB, whichever is smaller. The minimum size is as follows: (4 x LogbufferSize + 16MB) / --datadevices. You can increase the device size later as described in Adding Storage Space to Existing Nodes. For more information on setting the LogbufferSize, see Viewing and Modifying Configuration Attributes. |
| --dbpassword | -p | none | Creates a password for the HADB system user. Must be at least 8 characters. You can use --dbpasswordfile instead. For details, see The hadbm Command. |
| --dbpasswordfile | -P | none | Specifies a file that stores the password to be created for the HADB system user. For details, see The hadbm Command. |
| --hosts | -H | none | Specifies a comma-separated list of host names or IP addresses for the nodes in the database. Using IP addresses is recommended because there is no dependence on DNS lookups. Host names must be absolute. You cannot use localhost or 127.0.0.1 as a host name. One node is created for each comma-separated item in the list. The number of nodes must be even. Using duplicate host names creates multiple nodes on the same machine with different port numbers. Make sure that nodes on the same machine are not mirror nodes. Nodes are numbered starting at 0 in the order listed in this option. The first mirrored pair are nodes 0 and 1, the second 2 and 3, and so on. Odd numbered nodes are in one DRU, even numbered nodes in the other. If --spares is used, spare nodes are those with the highest numbers. For information about configuring double network interfaces, see Configuring Double Networks. |
| dbname | none | hadb | Specifies the database name, which must be unique. To make sure the database name is unique, use the hadbm list command to list existing database names. Use the default database name unless you need to create multiple databases. For example, to create multiple clusters with independent databases on the same set of HADB machines, use a separate database name for each cluster. |
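
The minimum device size formula given for --devicesize, (4 x LogbufferSize + 16MB) / --datadevices, can be evaluated as in the following sketch. It assumes LogbufferSize is expressed in MB and rounds the result up to a whole MB. The function name min_devicesize_mb is hypothetical, and the LogbufferSize values in the usage note are illustrative inputs, not documented defaults.

```shell
# Hypothetical helper: minimum --devicesize in MB, rounded up to a whole MB.
# Formula from the --devicesize description:
#   (4 x LogbufferSize + 16MB) / --datadevices
# Usage: min_devicesize_mb LOGBUFFERSIZE_MB DATADEVICES
min_devicesize_mb() {
    logbuf=$1
    devices=$2
    total=$((4 * logbuf + 16))
    # Integer division that rounds up instead of down.
    echo $(( (total + devices - 1) / devices ))
}
```

For example, an illustrative LogbufferSize of 48 MB with one data device gives 4 x 48 + 16 = 208 MB, and with two data devices gives 104 MB per device. This is only a lower bound: the recommended size remains four times the expected size of the user data.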
Setting Heterogeneous Device Paths
You can use the --set option of hadbm create to set the --devicepath differently for each node or each device. There are three types of devices: the DataDevice, the NiLogDevice (node internal log device), and the RelalgDevice (relational algebra query device). The syntax for each name=value pair is as follows, where -devno is required only if the device is DataDevice:
Node-nodeno.device-devno.Path=path
For example:
--set Node-0.DataDevice-0.Path=/disk0,Node-1.DataDevice-0.Path=/disk1
Any device path that is not set for a particular node or device defaults to the --devicepath value. You cannot change device paths using the hadbm set or hadbm addnodes commands.
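
For databases with many nodes, the --set value can be generated rather than typed by hand. The following sketch builds the name=value list for DataDevice-0 of each node, assuming a hypothetical layout in which node N stores its data device under a common prefix followed by N (for example /disk0, /disk1). The function name datadevice_paths is not part of hadbm.

```shell
# Hypothetical helper: build a --set value that places DataDevice-0 of node N
# under PREFIX followed by N (for example /disk0, /disk1, ...).
# Usage: datadevice_paths NODECOUNT PREFIX
datadevice_paths() {
    count=$1
    prefix=$2
    list=
    n=0
    while [ "$n" -lt "$count" ]; do
        # Append with a comma separator except before the first entry.
        list="${list:+$list,}Node-$n.DataDevice-0.Path=$prefix$n"
        n=$((n + 1))
    done
    printf '%s\n' "$list"
}
```

Calling datadevice_paths 2 /disk reproduces the value shown in the example above, which can then be passed to hadbm create after --set.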
Configuring Double Networks
To allow the HADB to tolerate single network failures, you can equip each HADB machine with two network interface cards (NICs). On each machine, the two NICs' IP addresses must be on separate IP subnets.
During database creation, you specify two IP addresses or host names for each node, one for each NIC, using the --hosts option. For each node, the first IP address is on net-0 and the second on net-1. The syntax is as follows, with host names for the same node separated by a plus sign (+):
--hosts=node0net0name+node0net1name,node1net0name+node1net1name,node2net0name+node2net1name, ...
For example, the following creates two nodes, each with two host names. The host names for node 0 are n0a and n0b, and the host names for node 1 are n1a and n1b. The n0a and n1a hosts are on net-0, and the n0b and n1b hosts are on net-1.
--hosts n0a+n0b,n1a+n1b
Within a database, all nodes must have one host name, or all nodes must have two host names. A database cannot have a mixture of nodes with double host names and single host names.
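
The double-network --hosts value can be assembled from pairs of host names, which also enforces the rule that every node has exactly two names. The following sketch assumes the names are supplied in node order, alternating net-0 and net-1. The function name hosts_double_net is hypothetical.

```shell
# Hypothetical helper: build a --hosts value from pairs of host names.
# Arguments alternate net-0 and net-1 names: node0net0 node0net1 node1net0 ...
hosts_double_net() {
    list=
    while [ "$#" -ge 2 ]; do
        # Join the two names of one node with "+", and nodes with ",".
        list="${list:+$list,}$1+$2"
        shift 2
    done
    if [ "$#" -ne 0 ]; then
        echo "odd number of host names: every node needs two" >&2
        return 1
    fi
    printf '%s\n' "$list"
}
```

Calling hosts_double_net n0a n0b n1a n1b yields n0a+n0b,n1a+n1b, the value used in the example above.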
Additional Steps for inetd
If you did not specify --inetd when you created the database, the database is initialized and started from the --installpath location, which you can verify using the hadbm status command.
If you specified --inetd, the database exists, as shown by the hadbm list command, but is not running, as shown by the hadbm status command.
Using inetd in production environments is recommended, but it is usually not needed in development or test environments. Creating a database using the --inetd option allows you to use inetd to automatically restart HADB nodes on a machine if the machine reboots. This allows for a more robust deployment, but it has its drawbacks in terms of administration.
In particular, if you want to stop a node, for example to perform some kind of maintenance, you must perform the following tasks:
Also, if you add nodes, you must take extra steps to update the inetd configuration files to take into account the new nodes.
To set up inetd for a new database or an existing database with new nodes, follow these steps:
- Stub files for inetd support are created in the directory specified in the --inetdsetupdir option and are named as follows:
dbname.hostname.inetd.conf
dbname.hostname.services
All stub files are placed locally on the machine where the hadbm create command is typed. Host names are further distinguished by numbers appended to the hostname.
Append the contents of these files to the /etc/inetd.conf and /etc/services files for each node’s machine, then reconfigure inetd as follows:
kill -HUP inetd-process-id
- After inetd has been reconfigured on each machine, you are ready to initialize and start your database using the hadbm clear command. See Clearing the HADB. This step is necessary for a new database, but not for an existing database with new nodes.
- You can verify that inetd is working by stopping a node and checking to make sure it restarts. See Stopping a Node and Getting the Status of the HADB.
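
The append step above can be sketched as a small helper that adds a stub file to a system file only once. The guard against appending twice is an added precaution, not part of the documented procedure, and the function name append_stub and the stub file names in the comments are hypothetical.

```shell
# Hypothetical helper: append a stub file to a target file unless the stub's
# first line is already present there (a simple guard against appending twice).
# Usage: append_stub STUBFILE TARGETFILE
append_stub() {
    stub=$1
    target=$2
    first_line=$(head -n 1 "$stub")
    if grep -qxF -- "$first_line" "$target"; then
        return 0        # already appended
    fi
    cat "$stub" >> "$target"
}

# On each node's machine, as root (stub names follow the pattern above):
#   append_stub hadb.host0.inetd.conf /etc/inetd.conf
#   append_stub hadb.host0.services   /etc/services
#   kill -HUP inetd-process-id
```

Because the files being changed are system configuration files, it is prudent to keep a backup copy of each before appending.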
Setting Up the JDBC Connection Pool
The Sun Java System Application Server communicates with the HADB in the same way that it communicates with relational databases used for data storage, so you need to set up a JDBC connection pool for the HADB as you would for any other database.
Using the clsetup command is recommended for configuring a JDBC connection pool and JDBC resource for the HADB in the Sun Java System Application Server. This command is described in the Sun Java System Application Server Installation Guide.
Manual configuration of a JDBC connection pool and JDBC resource for the HADB is briefly summarized in these sections:
For general information about connection pools and JDBC resources, see About JDBC Resources.
Getting the JDBC URL
Before you can set up the JDBC connection pool, you need to determine the JDBC URL of the HADB using the hadbm get command as follows:
hadbm get JdbcUrl [dbname]
For example:
hadbm get JdbcUrl
The JDBC URL is displayed on the standard output device in the following form:
jdbc:sun:hadb:host:port,host:port,...
Remove the jdbc:sun:hadb: prefix and use the host:port,host:port... part as the value of the serverList connection pool property, described in the next section.
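
The prefix removal can be done with shell parameter expansion, as in the following sketch. It assumes the function is given the URL string itself, in the form shown above. The function name jdbcurl_to_serverlist and the host and port values in the usage note are hypothetical.

```shell
# Hypothetical helper: turn an HADB JDBC URL into the host:port list used as
# the value of the serverList connection pool property.
# Usage: jdbcurl_to_serverlist URL
jdbcurl_to_serverlist() {
    url=$1
    case $url in
        jdbc:sun:hadb:*)
            # Strip the fixed prefix and keep host:port,host:port,...
            printf '%s\n' "${url#jdbc:sun:hadb:}"
            ;;
        *)
            echo "not an HADB JDBC URL: $url" >&2
            return 1
            ;;
    esac
}
```

For example, jdbcurl_to_serverlist jdbc:sun:hadb:n0:15205,n1:15225 prints n0:15205,n1:15225. The colons in the result still need to be escaped when the value is passed to asadmin on the command line.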
Creating a Connection Pool
The following table summarizes connection pool settings required for the HADB. Change Steady Pool Size when adding nodes, but do not change other settings.
The following table summarizes connection pool properties required for the HADB. Change serverList when adding nodes, but do not change other properties.
Table 21-5 HADB Connection Pool Properties

| Property | Description |
|----------|-------------|
| username | Specifies the name of the storeuser to be specified in the asadmin create-session-store command. See Creating the Session Store. |
| password | Specifies the storepassword to be specified in the asadmin create-session-store command. See Creating the Session Store. |
| serverList | Specifies the host:port list taken from the JDBC URL of the HADB. To determine this value, see Getting the JDBC URL. You must change this value if you add nodes to the database. See Adding Nodes to the HADB. |
| cacheDatabaseMetaData | Must be set to false. This setting ensures that calls to Connection.getMetaData() make calls to the database, which ensures that the connection is valid. |
| eliminateRedundantEndTransaction | Must be set to true. This setting improves performance by eliminating redundant commit and rollback requests and ignoring these requests if no transaction is open. |
| maxStatement | Specifies the maximum number of statements per open connection that are cached in the driver statement pool. Set this property to 20. |
Connection Pool Example
Here is an example asadmin create-jdbc-connection-pool command that creates an HADB JDBC connection pool. For more details about this command, see the Sun Java System Application Server Developer’s Guide to J2EE Services and APIs.
asadmin create-jdbc-connection-pool --user adminname --password secret --datasourceclassname com.sun.hadb.jdbc.ds.HadbDataSource --steadypoolsize=32 --isolationlevel=repeatable-read --isconnectvalidatereq=true --validationmethod=meta-data --property username=storename:password=secret456:serverList=host\\:port,host\\:port,host\\:port,host\\:port,host\\:port,host\\:port:cacheDatabaseMetaData=false:eliminateRedundantEndTransaction=true hadbpool
Note that colon characters (:) within property values must be escaped with double backslashes (\\) on Solaris platforms, because otherwise they are interpreted as property delimiters. On Windows platforms, colon characters (:) must be escaped with single backslashes (\). For details about using escape characters, see Using Escape Characters.
Creating a JDBC Resource
The following table summarizes JDBC resource settings required for the HADB.
Table 21-6 HADB JDBC Resource Settings

| Setting | Description |
|---------|-------------|
| JNDI Name | The following JNDI name is the default in the session persistence configuration: jdbc/hastore. You can use the default name or a different name. You must also specify this JNDI name as the value of the store-pool-jndi-name Persistence Store property when you activate the availability service. See Referencing the HADB Database’s JDBC Resource. |
| Pool Name | Select from the list the name (or ID) of the HADB connection pool used by this JDBC resource. For more information, see Creating a Connection Pool. |
| Data Source Enabled | Checked/true |
Managing the HADB
In general, management operations are not necessary unless you are replacing or upgrading your network, hardware, operating system, or HADB software. For assistance with these operations, call Sun customer support. See Using Sun Customer Support for the HADB. The following sections explain various management operations:
Starting a Node
If no node is running in the database, use hadbm clear to start the nodes even if you are running with inetd. See Clearing the HADB.
You may want to start a node in the following circumstances:
- If you have stopped a node, for example for hardware or software replacement. See Stopping a Node.
- If a node has stopped due to a hardware failure, after the hardware has been repaired.
- If a node has stopped due to a software failure and the node was unable to recover automatically.
If a node has stopped due to a failure, starting a node is not necessary if you are running with inetd, because after the machine reboots, inetd automatically restarts the node.
In most cases, you should first attempt to start the node using the normal start level. You must use the repair start level if starting a node using the normal start level fails or times out.
To start a node in the database, use the hadbm startnode command. The syntax is as follows:
hadbm startnode [--startlevel=level] nodeno [dbname]
For example:
hadbm startnode 1
The hadbm startnode command options are listed in the following table.
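
The advice to try the normal start level first and fall back to the repair start level can be sketched as a wrapper. This sketch rests on two assumptions: that hadbm exits with a non-zero status when the start fails or times out, and that the repair start level is selected with --startlevel=repair, a value inferred from the wording above. The function name start_node_with_fallback is hypothetical.

```shell
# Hypothetical wrapper: try the normal start level first, then repair.
# Assumes hadbm returns a non-zero exit status when a start fails or times out,
# and that "repair" is the argument value for the repair start level.
# Usage: start_node_with_fallback NODENO [DBNAME]
start_node_with_fallback() {
    nodeno=$1
    dbname=${2:-hadb}   # hadb is the default database name
    if hadbm startnode "$nodeno" "$dbname"; then
        return 0
    fi
    echo "normal start of node $nodeno failed; retrying with repair" >&2
    hadbm startnode --startlevel=repair "$nodeno" "$dbname"
}
```

A call such as start_node_with_fallback 1 corresponds to the example above, with the retry added.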
Stopping a Node
You may want to stop a node if you want to replace hardware or software on the machine, and you need to stop the machine.
Caution
Do not stop a node if its mirror node is not running, because this may force the database into a Non Operational state. For details about node status, see Getting the Status of the HADB.
To stop a node in the database, use the hadbm stopnode command. The syntax is as follows:
hadbm stopnode [--no-repair] nodeno [dbname]
For example:
hadbm stopnode 1
The hadbm stopnode command options are listed in the following table.
Restarting a Node
You may want to restart a node if you notice strange behavior in a node (for example excessive CPU consumption) and want to check whether a restart cures the problem.
Caution
Do not restart a node if its mirror node is not running, because stopping the node may force the database into a Non Operational state. For details about node status, see Getting the Status of the HADB.
To restart a node in the database, use the hadbm restartnode command. The syntax is as follows:
hadbm restartnode [--startlevel=level] nodeno [dbname]
For example:
hadbm restartnode 1
The hadbm restartnode command options are listed in the following table.
Starting the HADB
To start a database, use the hadbm start command. The syntax is as follows:
hadbm start [dbname]
For example:
hadbm start
The default dbname is hadb, all lowercase.
This command starts all nodes that were running before the database was stopped. Individually stopped (offline) nodes are not started when the database is started after a stop.
Stopping the HADB
When you stop and start the HADB in separate operations, data is unavailable while the HADB is stopped. To keep data available, you can restart the HADB as described in Restarting the HADB.
You may want to stop the HADB in the following circumstances:
- If you want to remove the HADB database.
- If you want to perform system maintenance that affects all HADB nodes.
- Before executing the hadbm clear command to reinitialize the database. See Clearing the HADB.
Before stopping the HADB, you should either stop dependent Sun Java System Application Server instances or configure them to use a different persistence method. For details, see Chapter 20, "Configuring Session Persistence (Enterprise Edition)."
Note
If you stop the HADB with hadbm stop, you must start it with hadbm start, even if inetd is used, because inetd cannot start offline nodes.
To stop a database, use the hadbm stop command. The syntax is as follows:
hadbm stop [dbname]
For example:
hadbm stop
The default dbname is hadb, all lowercase. For more information about database states, see Getting the Status of the HADB.
When you stop the database, all the running nodes in the database are stopped and the status of the database is Stopped.
If you have set up inetd to automatically restart the HADB, you must perform these steps to stop the HADB:
Restarting the HADB
You may want to restart the HADB if you notice strange behavior in the HADB (for example consistent timeout problems) and want to check whether a restart cures the problem.
When you restart the HADB, data and database services remain available, because hadbm restart performs a rolling restart of nodes: it stops and starts the nodes one by one. In contrast, hadbm stop stops all nodes simultaneously, so when you stop and start the HADB in separate operations, data and database services are unavailable while the HADB is stopped.
If an hadbm set command fails, restarting the HADB restores the previous configuration. For details about hadbm set, see Viewing and Modifying Configuration Attributes.
To restart a database, use the hadbm restart command. The syntax is as follows:
hadbm restart [--no-rolling] [dbname]
For example:
hadbm restart
The default dbname is hadb, all lowercase. By default, this command restarts each of the nodes in the database to the current state or a better state. If you specify the --no-rolling or -g option, this command restarts all nodes at once, with loss of service.
Listing Databases
To list all the databases that have been created, use the hadbm list command. The syntax is as follows:
hadbm list
Clearing the HADB
You may want to clear the HADB in the following circumstances:
- If you are creating a database that uses inetd. See Additional Steps for inetd.
- If the hadbm status command reveals that the database is Non Operational or that multiple nodes are in the Waiting state. See Getting the Status of the HADB.
- If you are recovering from session data corruption. See Recovering from Session Data Corruption.
The hadbm clear command stops the database nodes, clears the database devices, then starts the nodes. The syntax is as follows.
hadbm clear [--fast] [--spares=sparecount] --dbpassword=password | --dbpasswordfile=file [dbname]
For example:
hadbm clear --fast --spares=2 --dbpassword secret123
The hadbm clear command options are listed in the following table.
Table 21-10 hadbm clear Options
| Long Form | Short Form | Default | Description |
| --- | --- | --- | --- |
| --fast | -F | not present | If present, skips device initialization while initializing the database. Do not use if the disk storage device is corrupted or if you have just created the database and set up inetd. |
| --spares | -s | previous number of spares | Specifies the number of spare nodes the reinitialized database will have. This number must be even and must be less than the number of nodes in the database. Spare nodes are optional, but having two or more ensures high availability. |
| --dbpassword | -p | none | Specifies the HADB system user password. You can use --dbpasswordfile instead. For details, see The hadbm Command. |
| --dbpasswordfile | -P | none | Specifies a file that stores the password for the HADB system user. For details, see The hadbm Command. |
| dbname | none | hadb | Specifies the database name. |
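The option rules for hadbm clear can be checked before the command is run. The following Python sketch is illustrative only: the function build_clear_args and its parameters are hypothetical names, the node count is supplied by the caller, and the function only assembles an argument list from the options documented in this section. It does not run hadbm.

```python
def build_clear_args(node_count, dbpasswordfile, spares=None, fast=False, dbname=None):
    """Return an argument list for hadbm clear (hypothetical helper, not part of HADB)."""
    args = ["hadbm", "clear"]
    if fast:
        # --fast skips device initialization; avoid it on corrupted devices
        # or right after creating a database that uses inetd.
        args.append("--fast")
    if spares is not None:
        # The number of spares must be even and less than the number of nodes.
        if spares < 0 or spares % 2 != 0:
            raise ValueError("the number of spares must be even")
        if spares >= node_count:
            raise ValueError("the number of spares must be less than the number of nodes")
        args.append(f"--spares={spares}")
    # This sketch always uses the password file form of the two password options.
    args.append(f"--dbpasswordfile={dbpasswordfile}")
    if dbname is not None:
        args.append(dbname)
    return args
```

If spares is omitted, no --spares option is emitted, so the previous number of spares applies as described for the default.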
Removing a Database
The database you want to remove must exist and must be in the Stopped state. See Stopping the HADB. To remove an existing database from the HADB system, use the hadbm delete command. The syntax is as follows:
hadbm delete [dbname]
For example:
hadbm delete
The default database name is hadb, all lowercase. When you execute this command, the configuration files, device files, log files, and history files of the database are deleted, and shared memory resources are freed.
Expanding the HADB
If you determine that your system performance is limited because the HADB cannot persist data fast enough, you can expand the HADB to increase throughput without shutting down your Sun Java System Application Server cluster or the HADB. This section describes how you can expand the HADB in the following sections:
You should also read Maintaining the HADB Machines.
Adding Storage Space to Existing Nodes
You may want to add storage space to the HADB in the following circumstances:
- If there is unused space on the disks on which the HADB nodes reside, or if you have upgraded these disks.
- If one of the following messages appears:
4592: No free blocks on data devices
4593: No unreserved blocks on data devices
- If the hadbm deviceinfo command reports insufficient free size. See Getting Device Information.
You can increase the device size in MB using either of the following hadbm set commands:
hadbm set DataDeviceSize=size
hadbm set TotalDatadeviceSizePerNode=size
For example:
hadbm set DataDeviceSize=1024
The TotalDatadeviceSizePerNode is equal to the DataDeviceSize multiplied by the NumberOfDatadevices. Therefore, TotalDatadeviceSizePerNode and DataDeviceSize are mutually dependent: changing one changes the other.
The DataDeviceSize should be as large as possible. The recommended size is four times the expected size of the user data, based on the number of users and the size of each user record.
Changing the DataDeviceSize or TotalDatadeviceSizePerNode on a database in a FaultTolerant or higher state means that the system is upgraded without loss of data, and the database remains in an Operational state during the reconfiguration. If you change device size on a system that is not FaultTolerant or better, data is lost. For more information about database states, see Database Status.
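The sizing arithmetic in this section can be sketched in Python as follows. This is a minimal illustration, not an HADB tool: the function names are hypothetical, record sizes are assumed to be given in KB with 1024 KB per MB, and the sketch does not model how user data is distributed across nodes.

```python
def total_datadevice_size_per_node(data_device_size_mb, number_of_datadevices):
    """TotalDatadeviceSizePerNode = DataDeviceSize x NumberOfDatadevices (sizes in MB)."""
    return data_device_size_mb * number_of_datadevices


def recommended_data_device_size_mb(number_of_users, record_size_kb):
    """Four times the expected user data size, following the guideline above.

    Assumes one record per user and record sizes given in KB.
    """
    expected_user_data_mb = number_of_users * record_size_kb / 1024
    return 4 * expected_user_data_mb
```

For instance, with a DataDeviceSize of 1024 MB and two data devices per node, total_datadevice_size_per_node(1024, 2) evaluates to 2048 MB.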
Adding Machines
You may want to add machines if the HADB requires more processing or storage capacity. For an explanation of node topology alternatives, see the Sun Java System Application Server System Deployment Guide.
To add a new machine on which to run the HADB, install the HADB packages with or without the Sun Java System Application Server as described in the Sun Java System Application Server Installation Guide.
Adding Nodes to the HADB
When you create new nodes and add them to the database, you increase processing and storage capacity. To add nodes, use the hadbm addnodes command. The syntax is as follows:
hadbm addnodes [--no-refragment] [--spares=sparecount] --dbpassword=password | --dbpasswordfile=file [--inetdsetupdir=path] --hosts=node-list [dbname]
For example:
hadbm addnodes --dbpassword secret123 --hosts n6,n7,n8,n9
After you have added nodes, you must perform these additional tasks:
For details, see Setting Up the JDBC Connection Pool.
Note
If you created the database using --inetd, you must do the following:
- Use the --no-refragment option and refragment the database in a separate step using the hadbm refragment command.
- Take extra steps to update the inetd configuration files to take into account the new nodes you add. For more information, see Additional Steps for inetd.
The hadbm addnodes command options are listed in the following table.
Table 21-11 hadbm addnodes Options
| Long Form | Short Form | Default | Description |
| --- | --- | --- | --- |
| --no-refragment | -r | not specified | If specified, does not refragment the database during node creation; you can refragment the database later using the hadbm refragment command. For details about refragmentation, see Refragmenting the HADB. You must use this option if you created the database using --inetd. In this case, you must refragment the database in a separate step using hadbm refragment. If you do not have sufficient device space for a refragmentation, you can recreate the database with more nodes. See Adding Nodes Without Refragmenting. |
| --spares | -s | 0 | Specifies the number of new spare nodes in addition to those that already exist. This number must be even and must not be greater than the number of nodes added. Spare nodes are optional, but having two or more ensures high availability. |
| --dbpassword | -p | none | Specifies the HADB system user password. You can use --dbpasswordfile instead. For details, see The hadbm Command. |
| --dbpasswordfile | -P | none | Specifies a file that stores the password for the HADB system user. For details, see The hadbm Command. |
| --inetdsetupdir | -u | current directory | Specifies the directory in which to store the inetd setup files. The directory must exist on the machine and must be writable. |
| --hosts | -H | none | Specifies a comma-separated list of new host names for the new nodes in the database. One node is created for each comma-separated item in the list. The number of nodes must be even. Using duplicate host names creates multiple nodes on the same machine with different port numbers. Make sure that nodes on the same machine are not mirror nodes. Odd numbered nodes are in one DRU, even numbered nodes in the other. If --spares is used, new spare nodes are those with the highest numbers. If the database was created with double network interfaces, the new nodes must be configured in the same way. See Configuring Double Networks. |
| dbname | none | hadb | Specifies the database name. The database must be in the HA Fault Tolerant or Fault Tolerant state. For more information about database states, see Getting the Status of the HADB. |
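The constraints on --hosts and --spares can be verified before invoking hadbm addnodes. The Python sketch below is an assumption-laden illustration: build_addnodes_args is a hypothetical helper, it emits options in the order of the syntax line above, and it only builds the argument list without contacting the database.

```python
def build_addnodes_args(hosts, dbpasswordfile, spares=0, no_refragment=False, dbname=None):
    """Return an argument list for hadbm addnodes (hypothetical helper)."""
    # One node is created per host entry, and the number of nodes must be even.
    if len(hosts) == 0 or len(hosts) % 2 != 0:
        raise ValueError("the number of new nodes must be even")
    # New spares must be even in number and no more than the nodes added.
    if spares < 0 or spares % 2 != 0:
        raise ValueError("the number of new spares must be even")
    if spares > len(hosts):
        raise ValueError("the number of new spares must not exceed the nodes added")
    args = ["hadbm", "addnodes"]
    if no_refragment:
        # Required if the database was created using --inetd.
        args.append("--no-refragment")
    if spares:
        args.append(f"--spares={spares}")
    args.append(f"--dbpasswordfile={dbpasswordfile}")
    args.append("--hosts=" + ",".join(hosts))
    if dbname is not None:
        args.append(dbname)
    return args
```

The sketch does not check that nodes on the same machine are not mirror nodes; that still has to be verified against the actual node layout.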
Refragmenting the HADB
You must refragment the database before new nodes can store data. Refragmentation is required to store data evenly across all active nodes. To refragment the database, use the hadbm refragment command. The syntax is as follows:
hadbm refragment --dbpassword=password | --dbpasswordfile=file [dbname]
For example:
hadbm refragment --dbpassword secret123
Refragmentation requires that the user data size not exceed 50% of the space available for user data. For details, see Getting Device Information.
If this command fails even after multiple attempts, see Adding Nodes Without Refragmenting.
The hadbm refragment command options are listed in the following table.
Table 21-12 hadbm refragment Options
| Long Form | Short Form | Default | Description |
| --- | --- | --- | --- |
| --dbpassword | -p | none | Specifies the HADB system user password. You can use --dbpasswordfile instead. For details, see The hadbm Command. |
| --dbpasswordfile | -P | none | Specifies a file that stores the password for the HADB system user. For details, see The hadbm Command. |
| dbname | none | hadb | Specifies the database name. The database must be in the HA Fault Tolerant or Fault Tolerant state. For more information about database states, see Getting the Status of the HADB. |
Adding Nodes Without Refragmenting
If you don’t refragment the database when adding nodes, you must clear the database and recreate the session store instead; otherwise, the session store can’t use the new nodes. You should not add nodes without refragmenting the database unless you can tolerate losing all data stored in the database. However, it may be the best alternative if all of the following conditions are met:
- You don’t have enough disk space to expand each node as described in Adding Storage Space to Existing Nodes.
- The user data size exceeds 50% of the space available for user data, which means you cannot refragment as described in Refragmenting the HADB.
- You are not passivating the session state.
To add nodes without refragmenting, perform the following tasks:
- Perform the following tasks for each server instance:
- Disable the server instance in the load balancer, as described in Known Issues in Load Balancing Requests.
- Disable session persistence as described in Enabling and Disabling Availability.
- Restart the server instance.
- Re-enable the server instance in the load balancer.
If you do not need to maintain availability, you can disable and reenable all the server instances at once in the load balancer. This saves time and prevents failover of outdated session data.
- Stop the database as described in Stopping the HADB.
- Delete the database as described in Removing a Database.
- Recreate the database with the additional nodes as described in Creating a Database.
- Reconfigure the JDBC connection pool as described in Setting Up the JDBC Connection Pool. You can also use the cladmin command. See Appendix F, "Using the cladmin Command for Administration (Enterprise Edition)."
- Reload the session persistence store as described in Creating the Session Store. You can also use the cladmin command.
- Perform the following tasks for each server instance:
- Disable the server instance in the load balancer.
- Enable session persistence as described in Enabling and Disabling Availability.
- Restart the server instance.
- Re-enable the server instance in the load balancer.
If you do not need to maintain availability, you can disable and reenable all the server instances at once in the load balancer. This saves time and prevents failover of outdated session data.
Monitoring the HADB
You can monitor the activities in the HADB by performing the following tasks:
These sections briefly describe the hadbm status, hadbm deviceinfo, and hadbm resourceinfo commands. For details about interpreting HADB information, see the Sun Java System Application Server Performance Tuning Guide.
Getting the Status of the HADB
To display the status of the database or its nodes, use the hadbm status command. The syntax is as follows:
hadbm status [--nodes] [dbname]
For example:
hadbm status --nodes
The physical node number is associated with a specific database node and port number combination, and does not vary during the life of the database. The logical node number, on the other hand, can vary during the lifetime of the database. Initially, logical node numbers are identical to physical node numbers for active nodes used to store data. Logical node numbering can change if individual nodes are stopped (for example, for maintenance), and spare nodes take over.
The hadbm status --nodes command gives information about both physical and logical node numbers. All other hadbm subcommands deal with physical node numbers only. You only need to know about logical node numbers if you need to know which nodes are currently mirror nodes. This information is useful when you are performing maintenance on machines. See Maintaining the HADB Machines.
The hadbm status command options are listed in the following table.
Table 21-13 hadbm status Options
| Long Form | Short Form | Default | Description |
| --- | --- | --- | --- |
| --nodes | -n | not present | If present, displays node status information. See Node Status. |
| dbname | none | hadb | Specifies the database name. |
Database Status
The possible states of a database are as follows:
- High-Availability Fault Tolerant (HAFT) - The database is fault tolerant and has at least one spare node on each DRU.
- Fault Tolerant (FT) - All the mirrored node pairs are up and running.
- Operational (O) - At least one node in each mirrored node pair is running.
- Non Operational (NO) - One or more mirrored node pairs are missing both nodes.
- Stopped (S) - No nodes are running in the database.
- Unknown (U) - The command cannot determine the state of the database.
If the database is Non Operational, clear the database using hadbm clear as described in Clearing the HADB.
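The state definitions above can be read as a simple decision procedure. The following Python sketch is one possible interpretation, not the logic hadbm itself uses: the function name and input shapes are hypothetical, Stopped is checked first because a fully stopped database would otherwise also match Non Operational, and the Unknown state is not modeled.

```python
def database_state(mirror_pairs, running_spares_per_dru):
    """Classify a database state from node information (illustrative only).

    mirror_pairs: list of (bool, bool), whether each node of a mirrored pair is running.
    running_spares_per_dru: (int, int), running spare nodes on each of the two DRUs.
    """
    any_active_running = any(a or b for a, b in mirror_pairs)
    if not any_active_running and sum(running_spares_per_dru) == 0:
        return "Stopped"
    if any(not a and not b for a, b in mirror_pairs):
        # At least one mirrored pair is missing both nodes.
        return "Non Operational"
    if all(a and b for a, b in mirror_pairs):
        # Fault tolerant; HAFT additionally needs a spare on each DRU.
        if all(count >= 1 for count in running_spares_per_dru):
            return "High-Availability Fault Tolerant"
        return "Fault Tolerant"
    # Every pair has at least one running node, but not all nodes are up.
    return "Operational"
```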
Node Status
If you specify the --nodes option, the following information is displayed for each node in the database:
- Node number
- Name of the machine where the node is running
- Port number of the node
- Role of the node. For a list of possible roles and their meanings, see Roles of a Node.
- State of the node. For a list of possible states and their meanings, see States of a Node.
- Number of the corresponding mirror node.
A node’s role and state can change as described in these sections:
Roles of a Node
A node is assigned a role during its creation and can take any one of these roles:
- Active: An active node allows data storage and client access. Active nodes are in mirrored pairs.
- Spare: After having their data devices initialized, spare nodes monitor other data nodes to initiate repair if another node becomes unavailable. A spare node allows client access, but not data storage.
- Offline: A node is taken offline prior to stopping it to prevent restart by inetd. Offline nodes provide no services until their role changes. An offline node’s role can change back to its former role.
- Shutdown: An intermediate step between active and offline, which a node occupies while waiting for a spare node to take over its functioning. After the spare node has taken over, the node is taken offline.
States of a Node
A node can be in any one of the following states:
- Starting: The node is starting.
- Waiting: The node cannot decide its start level and is offline. If a single node is in this state for more than two minutes, stop the node and then start it at the repair level; see Stopping a Node and Starting a Node. If multiple nodes are in this state, clear the database as described in Clearing the HADB.
- Running: The node is providing all services that are appropriate for its role.
- Stopping: The node is in the process of stopping.
- Stopped: The node is inactive. Repair of a stopped node is prohibited.
- Recovering: The node is being recovered. When a node fails, the mirror node takes over the functions of the failed node. The failed node tries to recover by using the data and log records in main memory or on disk. The failed node uses the log records from the mirror node to catch up with the transactions performed when it was down. If recovery is successful, the node becomes active. If recovery fails, the node state changes to Repairing.
- Repairing: The node is being repaired. This operation reinitializes the node and copies the data and log records from the mirror node. Repair is more time consuming than recovery.
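The guidance for nodes in the Waiting state can be summarized as a small decision rule. The Python sketch below is only an illustration: the function name, the input dictionary, and the returned strings are hypothetical, and the two-minute threshold comes from the description of the Waiting state above.

```python
def waiting_node_action(waiting_seconds):
    """Suggest an action for nodes in the Waiting state (illustrative only).

    waiting_seconds: dict mapping node number -> seconds spent in the Waiting state.
    """
    if len(waiting_seconds) > 1:
        # Multiple waiting nodes: the guide recommends clearing the database.
        return "clear the database"
    if len(waiting_seconds) == 1:
        (seconds,) = waiting_seconds.values()
        if seconds > 120:
            # A single node waiting for more than two minutes.
            return "stop the node, then start it at the repair level"
        return "keep monitoring"
    return "no action"
```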
Getting Device Information
Monitoring the HADB involves making sure that there is enough free space for the growth of the database. To get information about disk storage devices on each active node, use the hadbm deviceinfo command. The syntax is as follows:
hadbm deviceinfo [--details] [dbname]
For example:
hadbm deviceinfo --details
The default dbname is hadb.
The information displayed for each node of the database includes:
To determine the space available for user data, take the total device size, then subtract 4 times the LogBufferSize. If you do not know the size of the log buffer, use the command hadbm get LogBufferSize. For example, if the total device size is 128 MB and the LogBufferSize is 24 MB, the space available for user data is 128 – (4 x 24) = 32 MB.
The difference between the total device size and the free size is the user data size. If the data may be refragmented in the future, the user data size should not exceed 50% of the space available for user data. If refragmentation is not relevant, close to 100% may be used. Resource consumption warnings are written to the history files when the system is running short on device space.
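The space calculations described above can be written out as a short Python sketch. The function names are hypothetical, all sizes are in MB, and the formulas simply restate the text: available space is the total device size minus four times the LogBufferSize, and user data size is the total size minus the free size.

```python
def space_available_for_user_data(total_device_size_mb, log_buffer_size_mb):
    """Total device size minus 4 times the LogBufferSize."""
    return total_device_size_mb - 4 * log_buffer_size_mb


def user_data_size(total_device_size_mb, free_size_mb):
    """Difference between the total device size and the free size."""
    return total_device_size_mb - free_size_mb


def can_refragment(total_device_size_mb, free_size_mb, log_buffer_size_mb):
    """True if user data does not exceed 50% of the space available for user data."""
    available = space_available_for_user_data(total_device_size_mb, log_buffer_size_mb)
    return user_data_size(total_device_size_mb, free_size_mb) <= 0.5 * available
```

With the figures from the example (128 MB total, 24 MB log buffer), the available space is 32 MB, so refragmentation is possible only while the user data size stays at or below 16 MB.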
For more information about tuning the HADB, see the Sun Java System Application Server Performance Tuning Guide.
If the --details option is specified, additional information is displayed:
For example:
NodeNO Totalsize Freesize Usage NReads NWrites DeviceName
0 128 120 6% 10000 5000 /var/opt/hadb.data.0
1 128 124 3% 10000 5000 /var/opt/hadb.data.1
2 128 126 2% 9500 4500 /var/opt/hadb.data.2
3 128 126 2% 9500 4500 /var/opt/hadb.data.3
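Output in this form can be post-processed by a script, for example to flag nodes that are running low on free space. The Python sketch below assumes that real hadbm deviceinfo output is whitespace-separated with a header row exactly as in the sample above, and that device names contain no spaces; the function names and the threshold parameter are hypothetical.

```python
SAMPLE_OUTPUT = """\
NodeNO Totalsize Freesize Usage NReads NWrites DeviceName
0 128 120 6% 10000 5000 /var/opt/hadb.data.0
1 128 124 3% 10000 5000 /var/opt/hadb.data.1
2 128 126 2% 9500 4500 /var/opt/hadb.data.2
3 128 126 2% 9500 4500 /var/opt/hadb.data.3
"""


def parse_deviceinfo(text):
    """Parse whitespace-separated deviceinfo output into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def low_space_nodes(rows, min_free_fraction):
    """Return node numbers whose free size is below the given fraction of total size."""
    return [
        int(row["NodeNO"])
        for row in rows
        if int(row["Freesize"]) / int(row["Totalsize"]) < min_free_fraction
    ]


rows = parse_deviceinfo(SAMPLE_OUTPUT)
print(low_space_nodes(rows, 0.95))
```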
If you need additional information, you can use the hadbm resourceinfo command. This command displays HADB runtime resource information that helps to identify resource contention, which you can use to reduce performance bottlenecks. For details, see the Sun Java System Application Server System Deployment Guide and the Sun Java System Application Server Performance Tuning Guide. The syntax is as follows:
hadbm resourceinfo [--databuf] [--locks] [--logbuf] [--nilogbuf] [dbname]
The following database information is displayed based on the options you specify:
For example, data buffer pool information is as follows:
NodeNO Avail Free Access Misses Copy-on-Write
0 256 128 100000 50000 1000
1 256 128 110000 45000 950
Locks information is as follows:
For example:
NodeNO Avail Free Waits
0 50000 20000 10
1 50000 20000 0
No more than 50% of the allocated locks are used for primary recording operations. The other 50% are reserved for hot standby recording operations. To change the NumberOfLocks, see Viewing and Modifying Configuration Attributes.
Log buffer information is as follows:
For example:
NodeNO Avail Free
0 16 2
1 16 3
Node internal log device information is as follows:
For example:
NodeNO Avail Free
0 16 2
1 16 3
Maintaining the HADB Machines
The HADB achieves fault tolerance by replicating data on mirror nodes. Mirror nodes should be placed on separate DRUs in a production environment as described in HADB Architecture.
A failure is an unexpected event such as a hardware failure, power failure, or operating system reboot. The HADB tolerates single failures: of one node, one machine (that has no mirror node pairs), one or more machines belonging to the same DRU, or even one entire DRU. However, the HADB does not automatically recover from a double failure, which is the simultaneous failure of one or more mirror node pairs. If a double failure occurs, you must clear the HADB and recreate its session store, which erases all its data.
Installing the entire HADB on a single machine is recommended only for development and test environments, because in this case any failure except a single node failure is a double failure.
Caution
Before performing any maintenance, make sure you know which nodes are mirror nodes so you don’t shut down a mirror node pair and make the database Non Operational. See Getting the Status of the HADB.
To perform planned or unplanned maintenance on a single machine without interrupting HADB service:
- For planned maintenance, stop all nodes on the machine. See Stopping a Node.
- Perform the maintenance procedure and get the machine up and running.
- Start all nodes on the machine if either of the following is true:
- If you stopped all nodes manually in the first step, regardless of whether you are using inetd
- If you are not using inetd, regardless of how the nodes were stopped
See Starting a Node.
- Check whether the nodes are active and running. See Getting the Status of the HADB.
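Before stopping the nodes on a machine, it can help to confirm that doing so will not take down both nodes of a mirror pair. The Python sketch below is an illustration under stated assumptions: the function name and the dictionary layout are hypothetical, the per-node fields (machine name, state, mirror node number) are the ones that hadbm status --nodes is described as reporting, and spare nodes are assumed to have no mirror.

```python
def safe_to_stop_machine(machine, nodes):
    """Return True if stopping every node on `machine` leaves each mirror pair with a running node.

    nodes: dict mapping node number -> {"host": str, "state": str, "mirror": int or None}
    """
    for info in nodes.values():
        if info["host"] != machine or info["mirror"] is None:
            continue
        mirror = nodes[info["mirror"]]
        # Unsafe if the mirror is on the same machine or is not currently running.
        if mirror["host"] == machine or mirror["state"] != "Running":
            return False
    return True
```

A False result means that the maintenance would cause a double failure for at least one mirror pair, which makes the database Non Operational.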
To perform planned maintenance on all HADB machines without interrupting HADB service:
To perform planned maintenance with HADB service interruption on all HADB machines, or when the entire HADB is on a single machine:
- Stop the HADB. See Stopping the HADB.
- Perform the maintenance procedure and get all the machines up and running.
- Start the HADB. See Starting the HADB. The data stored in the database before the stop is available again.
To perform unplanned maintenance in the event of a failure, first check the database status. See Getting the Status of the HADB.
- If the database state is Operational or better, this means the machines needing unplanned maintenance do not include mirror nodes. Follow the single machine procedure for each failed machine, one DRU at a time. HADB service is not interrupted.
- If the database state is Non Operational, this means the machines needing unplanned maintenance include mirror nodes. One such case is when the entire HADB is on a single failed machine. Get all the machines up and running first. Then clear the HADB and recreate the session store. See Clearing the HADB and Creating the Session Store. This interrupts HADB service.
Viewing and Modifying Configuration Attributes
You can modify database configuration attributes. This section describes the following tasks:
Getting the Values of Configuration Attributes
To get the values of configuration attributes (for a list, see Configuration Attributes), use the hadbm get command. The syntax is as follows:
hadbm get attribute-list | --all [dbname]
For example:
hadbm get JdbcUrl,NumberOfSessions
The default dbname is hadb. The attribute-list is a comma-separated or quote-enclosed space-separated list of attributes. The --all option displays values for all attributes.
Setting the Values of Configuration Attributes
To set the values of configuration attributes (for a list, see Configuration Attributes), use the hadbm set command. The syntax is as follows:
hadbm set [dbname] attribute=value,attribute=value ...
The default dbname is hadb. The attribute=value pairs are given as a comma-separated list or as a quote-enclosed, space-separated list.
If execution of this command is successful, the database is restarted in the state it was in previously, or in a better state. For information about database states, see Getting the Status of the HADB.
If execution of this command is unsuccessful, restart the HADB as described in Restarting the HADB.
The following attributes cannot be set by hadbm set, but can be set during database creation using --set or other options of hadbm create: ConfigPath, DatabaseName, DevicePath, HistoryPath, InstallPath, ManagementProtocol, NumberOfDatadevices, and Portbase. For information about hadbm create, see Creating a Database.
The JdbcUrl attribute value is derived from the --hosts and --portbase options during database creation with hadbm create and cannot be set by hadbm set or the --set option.
All other attributes listed in Table 21-14 can be set using hadbm set.
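The rules about which attributes can be changed with hadbm set can be captured in a small lookup. The Python sketch below is illustrative: the function names are hypothetical, attribute names are compared case-insensitively as an assumption, and the sketch does not check that a name actually appears in Table 21-14.

```python
# Attributes that can only be set at creation time with hadbm create.
CREATE_ONLY = {
    "ConfigPath", "DatabaseName", "DevicePath", "HistoryPath",
    "InstallPath", "ManagementProtocol", "NumberOfDatadevices", "Portbase",
}
# Derived from --hosts and --portbase; cannot be set directly.
DERIVED = {"JdbcUrl"}


def where_to_set(attribute):
    """Return how an attribute is set: 'derived', 'hadbm create', or 'hadbm set'."""
    name = attribute.casefold()  # assumption: names compare case-insensitively
    if name in {n.casefold() for n in DERIVED}:
        return "derived"
    if name in {n.casefold() for n in CREATE_ONLY}:
        return "hadbm create"
    return "hadbm set"


def build_set_args(assignments, dbname=None):
    """Return an argument list for hadbm set from a dict of attribute -> value."""
    blocked = [name for name in assignments if where_to_set(name) != "hadbm set"]
    if blocked:
        raise ValueError("cannot be set with hadbm set: " + ", ".join(blocked))
    args = ["hadbm", "set"]
    if dbname is not None:
        args.append(dbname)
    # Comma-separated attribute=value pairs, as in the syntax line above.
    args.append(",".join(f"{name}={value}" for name, value in assignments.items()))
    return args
```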
Configuration Attributes
The following table lists the configuration attributes that you can get and set. Except where noted, sizes are in MB, and times are in seconds.
Clearing and Archiving History Files
HADB history files contain a record of database operations and error messages. The location of these files is determined by the --historypath option of the hadbm create command. The default location is /var/tmp. These files have names of the format dbname.out.nodeno. For details about hadbm create, see Creating a Database.
These history files grow over time. To save space and prevent files from getting too large, you should periodically clear and archive older history files. To clear the history files of a database, use the hadbm clearhistory command. The syntax is as follows:
hadbm clearhistory [--saveto=path] [dbname]
The default dbname is hadb.
Use the --saveto or -o option to specify a directory if you want to store the old history files. This directory must have write permissions set.
Each message in the history file contains the following information:
Messages about resource shortages contain HIGH LOAD.
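A script can locate a node's history file and pick out resource shortage messages using only the facts given above: the dbname.out.nodeno naming format, the default /var/tmp location, and the HIGH LOAD marker. The Python sketch below makes no assumption about the rest of the message layout; the function names are hypothetical.

```python
import posixpath


def history_file_path(dbname="hadb", nodeno=0, historypath="/var/tmp"):
    """Build the path of a history file named dbname.out.nodeno."""
    return posixpath.join(historypath, f"{dbname}.out.{nodeno}")


def high_load_messages(lines):
    """Return the lines that contain the HIGH LOAD marker."""
    return [line.rstrip("\n") for line in lines if "HIGH LOAD" in line]


def scan_history_file(path):
    """Read a history file and return its HIGH LOAD messages."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return high_load_messages(handle)
```

If the database was created with a different --historypath, pass that directory as historypath instead of relying on the default.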
You do not need a detailed knowledge of all the various types of entries in the history file. If for any reason you need to study a history file in greater detail, you should obtain help from Sun customer support. See Using Sun Customer Support for the HADB.
Recovering from Session Data Corruption
The following are indications that session data may be corrupted:
- Error messages appear in the Sun Java System Application Server system log (server log) every time you try to save the session state.
- Error messages are written to the server log indicating that the session could not be found or could not be loaded during session activation.
- Sessions that are activated after previously being passivated contain empty or incorrect session data.
- When an instance fails, failed-over sessions contain empty or incorrect session data.
- When an instance fails, instances that try to load a failed-over session cause an error in the server log indicating the session could not be found or could not be loaded.
To bring the session store back to a consistent state if you determine that the data has been corrupted, do the following:
- Clear the session store. For more information, see Clearing the Session Store.
- If clearing the session store doesn’t work or you continue to see errors in the server log, reinitialize the data space on all the nodes and clear the data in the database. See Clearing the HADB.
- If clearing the database doesn’t work, delete and then recreate the database. See Removing a Database and Creating a Database.
Using Sun Customer Support for the HADB
Before calling Sun customer support about HADB issues, you should gather as much of the following information about your system as possible:
Environment Variables
This table lists environment variables that correspond to hadbm command options.