Creating and Configuring WebLogic Server Domains


Recovering Failed Servers

You can use several strategies to recover failed WebLogic Server instances in a domain. All of the strategies require that you back up a domain's configuration and security data.

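This topic describes the following tasks:

  Backing Up Configuration Data
  Backing Up Security Data
  Restarting an Administration Server When Managed Servers are Running
  Starting a Managed Server When the Administration Server Is Not Accessible
  Self-Health Monitoring and Restart for Managed Servers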

 


Backing Up Configuration Data

Because the Administration Server uses its configuration files to administer the domain, we recommend that you keep an archived copy of these files in case a failure of the Administration Server makes them unavailable. You can use any of the common archiving methods, such as periodic backups, fault-tolerant disks, and manually copying files whenever they change. If an Administration Server fails, you can copy your backup files onto a new machine and restart the Administration Server on that machine.

By default, an Administration Server stores a domain's configuration data in a file called domain_name\config.xml, where domain_name is the root directory of the domain. Many of the configuration changes that you make to servers within the domain, whether through the Administration Console, the weblogic.Admin command, or the JMX API, are persisted in the config.xml file. For example, the MBeans that configure a WebLogic Server instance persist their data in config.xml.

Note: When you start a server, you can configure it to use a different configuration file. For more information, refer to Using the weblogic.Server Command in the WebLogic Server Administration Guide.

After an Administration Server successfully completes its startup sequence and is ready to process requests, it saves a copy of its configuration file in the following file:

domain_name/config.xml.booted

We recommend that you make a backup copy of config.xml and config.xml.booted. If you need to back out of any changes that you make to the configuration during a given session, you can revert to the configuration as defined in config.xml.booted.
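The backup recommended above can be scripted. The following is a minimal sketch for a UNIX shell, not a WebLogic Server utility. The function name backup_domain_config, its two arguments, and the example paths in the comments are hypothetical.

```shell
# Sketch: copy config.xml and config.xml.booted from a domain's root
# directory into a new timestamped directory and print that directory's path.
# Hypothetical usage: backup_domain_config /path/to/mydomain /path/to/backups
backup_domain_config() {
    domain_dir=$1      # root directory of the domain
    backup_root=$2     # directory under which backups are kept
    dest="$backup_root/config-$(date +%Y%m%d-%H%M%S)"

    if [ ! -f "$domain_dir/config.xml" ]; then
        echo "no config.xml in $domain_dir" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1
    cp -p "$domain_dir/config.xml" "$dest/" || return 1
    # config.xml.booted is written only after a successful startup,
    # so copy it only if it is present.
    if [ -f "$domain_dir/config.xml.booted" ]; then
        cp -p "$domain_dir/config.xml.booted" "$dest/" || return 1
    fi
    echo "$dest"
}
```

Keeping each backup in its own timestamped directory makes it possible to return to an earlier configuration rather than only the most recent one.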

In addition, you can configure one or more Managed Servers to replicate some of a domain's configuration data. A subsequent section in this topic, Replicating the Domain's Configuration Files, describes setting up this type of replication.

 


Backing Up Security Data

The Administration Server is responsible for maintaining security data for a domain. We recommend that you archive security data on the Administration Server in case it fails. You cannot modify security data unless an Administration Server is available.

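This section describes the following tasks:

  Backing Up Security Configuration Data
  Backing Up the WebLogic LDAP Repository
  Backing Up SerializedSystemIni.dat and Security Certificates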

Backing Up Security Configuration Data

For each security provider that you use, the WebLogic Security framework instantiates a managed bean (MBean) to maintain the provider's configuration. To persist the configuration across server sessions, a domain creates an MBean repository in a directory named domain_name\userConfig\Security. When you start a server, it uses the data from the repository to create a runtime cache. The Administration Server periodically updates the userConfig\Security directory based on data in the cache. When you shut down a server, it also updates the MBean repository.

You can use any of the following strategies to back up your security MBean repository, which is in a binary format:

The following sections describe how to back up and debug security data using XML files:

Dumping Security Configuration Data into an XML File

The WebLogicMBeanDumper utility reads the data in domain_name\userConfig and outputs an XML file. Because the utility works with data in the MBean repository, you do not need to start a server to use WebLogicMBeanDumper.

To dump data for all security-provider MBeans, do the following:

  1. Shut down the Administration Server to prevent the server's periodic updates to the MBean repository.

  2. Open a command shell.

  3. Navigate to the domain's root directory.

  4. To set the Java classpath, enter the following command:

    WL_HOME\server\bin\setWLSEnv.cmd (on Windows)
    WL_HOME/server/bin/setWLSEnv.sh (on UNIX)

    For more information, refer to Setting the Classpath in the WebLogic Server Administration Guide.

  5. Enter the following command:

    java weblogic.management.commo.WebLogicMBeanDumper
        -includeDefaults -name Security:* output-file
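On UNIX, step 5 can be wrapped in a small shell function. This is a sketch only: it assumes that steps 1 through 4 have been completed in the same shell. The function name dump_security_mbeans and the JAVA variable (which lets you substitute another launcher, such as echo for a dry run) are hypothetical.

```shell
# Sketch of step 5. Assumes the Administration Server is shut down and
# setWLSEnv.sh has already set the classpath in this shell.
dump_security_mbeans() {
    domain_dir=$1     # the domain's root directory
    output_file=$2    # XML file to write; a relative name is relative to domain_dir
    # Run from the domain's root directory. The pattern is quoted so that the
    # shell passes Security:* to the utility literally instead of expanding
    # the * against file names in the current directory.
    ( cd "$domain_dir" && "${JAVA:-java}" weblogic.management.commo.WebLogicMBeanDumper \
          -includeDefaults -name 'Security:*' "$output_file" )
}
```

Running the command in a subshell leaves the caller's working directory unchanged.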

Listing 4-1 shows an excerpt of an output file that contains security MBean data. The XML format is as follows:

Listing 4-1 Security MBean Instances Formatted in XML

<?xml version="1.0" encoding="UTF-8"?>
<MBeans>
  <SetMBean 
      DisplayName="DefaultRoleMapper"
      ObjectName="Security:Name=myrealmDefaultRoleMapper"
      Type="weblogic.management.security.providers.authorization.DefaultRoleMapper"
  >
      <Attributes Realm="Security:Name=myrealm">
            <Defaulted AttributeNames="RoleDeploymentEnabled"/>
      </Attributes>
  </SetMBean>
...
</MBeans>
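Before you edit or load a dumped file, it can be useful to see which MBeans it contains. The following sketch prints the ObjectName value of each entry. It is a plain text filter, not an XML parser, and it assumes that each ObjectName attribute sits on a single line, as in Listing 4-1. The function name list_mbean_object_names is hypothetical.

```shell
# Sketch: print the value of every ObjectName="..." attribute in a dumped file.
# Handles one ObjectName attribute per line; lines without one are skipped.
list_mbean_object_names() {
    sed -n 's/.*ObjectName="\([^"]*\)".*/\1/p' "$1"
}
```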

Modifying the Dumped XML File

You can make any of the following changes to the dumped XML file:

By default, the values of password attributes are not printed in clear text in the XML file but instead are masked as "*****". This avoids the security risk of writing clear-text passwords to the file system, but it makes these files non-portable to other domains because the ***** mask strings cannot be interpreted meaningfully. To make the XML file portable, you need to print clear text values to it, or else encrypted values that the new domain can decrypt. Do this using the -Dweblogic.management.commo.dumpFormat=<encrypted | cleartext> argument, which causes the following behavior:

When you load the modified XML file as described in the next section, the WebLogicMBeanLoader utility regenerates the repository based on the data in the file.

Loading XML Data into an MBean Repository

To replace the data in a domain's MBean repository with an XML file generated by WebLogicMBeanDumper, do the following:

  1. Shut down the Administration Server to prevent the server's periodic updates to the MBean repository.

  2. Under domain_name/userConfig, rename the Security directory and move it outside of the userConfig directory tree.

    Note: Make sure that you rename and move the directory instead of just making a copy. These steps assume that you want to replace the entire security MBean repository with the data in your dumped XML file. If a domain_name/userConfig/Security directory exists before you go to the next step, modifications to existing MBeans will succeed, any MBean additions will succeed, but no MBeans will be removed.

  3. To set the Java classpath, enter the following command:

    WL_HOME\server\bin\setWLSEnv.cmd (on Windows)
    WL_HOME/server/bin/setWLSEnv.sh (on UNIX)

    For more information, refer to Setting the Classpath in the WebLogic Server Administration Guide.

  4. From the domain's root directory, enter the following command:

    java weblogic.management.commo.WebLogicMBeanLoader XML-file
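On UNIX, steps 2 and 4 can be combined in one shell function so that the existing repository is always moved aside before the loader runs. This is a sketch under stated assumptions: the Administration Server is already shut down and setWLSEnv.sh has been run in the same shell. The names reload_security_mbeans, JAVA, and the holding directory argument are hypothetical.

```shell
# Sketch of steps 2 and 4 of the procedure above.
reload_security_mbeans() {
    domain_dir=$1     # the domain's root directory
    xml_file=$2       # file produced by WebLogicMBeanDumper; a relative
                      # name is relative to domain_dir
    holding_dir=$3    # existing directory outside domain_dir/userConfig

    # Step 2: rename and move (not copy) the existing repository so that
    # the loader starts from an empty userConfig/Security.
    if [ -d "$domain_dir/userConfig/Security" ]; then
        mv "$domain_dir/userConfig/Security" \
           "$holding_dir/Security.$(date +%Y%m%d-%H%M%S)" || return 1
    fi

    # Step 4: run the loader from the domain's root directory.
    ( cd "$domain_dir" && "${JAVA:-java}" weblogic.management.commo.WebLogicMBeanLoader "$xml_file" )
}
```

The moved directory is kept rather than deleted, so the previous repository can be restored if the load does not produce the expected result.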

Creating a Default Security Data Repository

If you want to regenerate all security MBeans in a domain and return to the configuration as installed by WebLogic Server, do the following:

  1. Shut down the Administration Server to prevent the server's periodic updates to the MBean repository.

  2. Under domain_name/userConfig, rename the Security directory and move it outside of the userConfig directory tree.

  3. Start the Administration Server.

The Administration Server generates a new security MBean repository that uses the installed security providers with default values. To log on to the server with this installed configuration, you must provide the administrative user name that you specified when you created the domain. For more information, refer to Specifying an Initial Administrative User in the WebLogic Server Administration Guide.

Backing Up the WebLogic LDAP Repository

The default Authentication, Authorization, Role Mapper, and Credential Mapper providers that are installed with WebLogic Server store their data in an LDAP server. Each WebLogic Server contains an embedded LDAP server. The Administration Server contains the master LDAP server, which is replicated on all Managed Servers. If any of your security realms use these installed providers, we recommend that you back up the following directory tree:

domain_name\adminServer\ldap

where domain_name is the domain's root directory and adminServer is the directory that the Administration Server generates to store runtime, security, and other data. Each WebLogic Server generates such a directory, but you only need to back up the LDAP data on the Administration Server.

For example, suppose that your security realm uses the default Authentication provider that is installed with WebLogic Server. If your Administration Server is named myAdminServer and your domain is named myDomain, back up the following directory tree:

myDomain\myAdminServer\ldap

Under the ldap directory, the ldapfiles subdirectory contains the data files for the LDAP server. Data files in this directory store user, group, group membership, policies, and role information. Other subdirectories of the ldap directory store such information as message logs for the LDAP server and data about replicated LDAP servers.

If you or someone else is modifying data for one of the security providers while you are backing up the ldap directory tree, your backup of the files in the ldapfiles subdirectory might be in an inconsistent state. For example, if someone is using the installed default Authentication provider to add a user, the backup might start after the add has started, but before the add has completed.

Once a day, a server suspends write operations and creates its own backup of the LDAP data. It archives this backup in a ZIP file below the ldap\backup directory and then resumes write operations. This backup is guaranteed to be consistent, but it might not contain the latest security data.
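Because the server-generated archive is consistent, one option is to copy the newest archive to your backup location instead of copying the live ldapfiles directory. The following sketch does this for a UNIX shell. It assumes that the archives sit directly in ldap/backup and have a .zip extension; the text above states only that the archive is a ZIP file below that directory. The function name copy_latest_ldap_backup is hypothetical.

```shell
# Sketch: copy the most recently modified ZIP archive from a server's
# ldap/backup directory and print the path of the copy.
copy_latest_ldap_backup() {
    ldap_dir=$1       # for example, myDomain/myAdminServer/ldap
    dest_dir=$2       # existing directory to copy the archive into
    # ls -t sorts by modification time, newest first.
    latest=$(ls -t "$ldap_dir"/backup/*.zip 2>/dev/null | head -n 1)
    if [ -z "$latest" ]; then
        echo "no ZIP backup found in $ldap_dir/backup" >&2
        return 1
    fi
    cp -p "$latest" "$dest_dir/" && echo "$dest_dir/$(basename "$latest")"
}
```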

For information about configuring this LDAP backup, refer to Configuring Backups for the Embedded LDAP Server in the Administration Console Online Help.

You do not need to back up the LDAP data on a Managed Server because the master LDAP server replicates its data to each Managed Server as updates are made to the master server. If a domain's Administration Server is unavailable, the WebLogic security providers cannot modify security data. (The LDAP repositories on Managed Servers are replicas and therefore cannot be modified.)

Backing Up SerializedSystemIni.dat and Security Certificates

Each server creates a file named SerializedSystemIni.dat in its root directory. This file contains encrypted security data that must be present to boot the server. You must back up this file.

If you configured a server to use SSL, you must also back up the security certificates and keys. The location of these files is user-configurable.

 


Restarting an Administration Server When Managed Servers are Running

If the Administration Server shuts down while Managed Servers continue to run, you do not need to restart the Managed Servers that are already running in order to recover management of the domain. The procedure for recovering management of an active domain depends upon whether you can restart the Administration Server on the same machine it was running on when the domain was started.

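This section describes the following tasks:

  Restarting an Administration Server on the Same Machine
  Restarting an Administration Server on Another Machine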

Restarting an Administration Server on the Same Machine

If you restart the WebLogic Administration Server while Managed Servers continue to run, by default the Administration Server can discover the presence of the running Managed Servers.

Note: Make sure that the startup command or startup script does not include -Dweblogic.management.discover=false, which disables an Administration Server from discovering its running Managed Servers. For more information about -Dweblogic.management.discover, refer to Frequently Used Optional Arguments in the WebLogic Server Administration Guide.
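A quick way to check a startup script for this argument on UNIX is to search it. The following sketch is a plain text search; it also matches the argument when it appears in a commented-out line, so treat a match as a prompt to inspect the script. The function name check_discover_flag is hypothetical.

```shell
# Sketch: return 1 (and print a message) if the given startup script
# contains -Dweblogic.management.discover=false, otherwise return 0.
check_discover_flag() {
    script=$1
    # -e keeps grep from treating the leading dash of the pattern as an option.
    if grep -q -e '-Dweblogic\.management\.discover=false' "$script"; then
        echo "$script contains -Dweblogic.management.discover=false" >&2
        return 1
    fi
    return 0
}
```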

The root directory for the domain contains a file named running-managed-servers.xml, which lists the Managed Servers that the Administration Server knows about. When the Administration Server starts, it uses this list to check for the presence of running Managed Servers.

Restarting the Administration Server does not cause Managed Servers to update the configuration of static attributes. Static attributes are those that a server refers to only during its startup process. WebLogic Servers must be restarted to take account of changes to static configuration attributes. Discovery of the Managed Servers only enables the Administration Server to monitor the Managed Servers or make runtime changes in attributes that can be configured while a server is running (dynamic attributes).

Restarting an Administration Server on Another Machine

If a machine crash prevents you from restarting the Administration Server on the same machine, you can recover management of the running Managed Servers as follows:

  1. Install the WebLogic Server software on the new administration machine (if this has not already been done).

  2. Make your application files available to the new Administration Server by copying them from backups or by using a shared disk. Your application files should be available in the same relative location on the new file system as on the file system of the original Administration Server.

  3. Make your configuration and security data available to the new administration machine by copying them from backups or by using a shared disk. For more information, refer to Backing Up Configuration Data and Backing Up Security Data.

  4. Restart the Administration Server on the new machine.

Note: Make sure that the startup command or startup script does not include -Dweblogic.management.discover=false, which disables an Administration Server from discovering its running Managed Servers. For more information about -Dweblogic.management.discover, refer to Frequently Used Optional Arguments in the WebLogic Server Administration Guide.

When the Administration Server starts, it communicates with the Managed Servers and informs them that the Administration Server is now running on a different IP address.

 


Starting a Managed Server When the Administration Server Is Not Accessible

Usually when a Managed Server starts, it contacts the Administration Server to retrieve its configuration information. If a Managed Server is unable to connect to the specified Administration Server during startup, it can retrieve its configuration by reading a configuration file and other files directly.

A Managed Server that starts in this way is running in Managed Server Independence mode. In this mode, a server uses its cached application files to deploy the applications that are targeted to the server. You cannot change a Managed Server's configuration until it is able to restore communication with the Administration Server.

This section contains the following subsections:

Starting in Managed Server Independence Mode

If Managed Server Independence Mode is enabled (which is the default setting for a server), and if the Administration Server is unavailable when you start a Managed Server, the Managed Server looks in its root directory for the following files:

  msi-config.xml
  SerializedSystemIni.dat
  boot.properties (an optional boot identity file)

By default, a server assumes that its root directory is the directory from which you issue the server startup command.

If a Managed Server runs in the same domain and on the same machine as the Administration Server, by default it shares its root directory with the Administration Server. If you start such a Managed Server, it will automatically find the configuration files.

If a Managed Server does not share its root directory with the Administration Server, you can do any of the following:

Note: If you set up SSL for your servers, each server requires its own set of certificate files, key files, and other SSL-related files. Managed Servers do not retrieve SSL-related files from the Administration Server (though the domain's configuration file does store the pathnames to those files for each server). Starting in Managed Server Independence Mode does not require you to copy or move the SSL-related files unless they are located on a machine that is inaccessible.

The Node Manager and Managed Server Independence

You cannot use the Node Manager to start a server in Managed Server Independence mode. The Node Manager requires the presence of the Administration Server. If the Administration Server is unavailable, you must log on to a local host to start a Managed Server. For more information about the Node Manager, refer to Managing Server Availability with Node Manager.

MSI Mode and the Managed Server's Root Directory

By default, a server instance assumes that its root directory is the directory from which it was started. For more information about a server's root directory, refer to "A Server's Root Directory".

If you enable replication of configuration data, as described in Replicating the Domain's Configuration Files, and if you have started the Managed Server at least once while the Administration Server was running, msi-config.xml and SerializedSystemIni.dat will already be in the server's root directory. The boot.properties file is not replicated. If it is not already in the Managed Server's root directory, you must create one. For more information, see "Bypassing the Prompt for Username and Password" in the WebLogic Server Administration Guide.
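Before you rely on Managed Server Independence mode, you can confirm that these files are present. The following UNIX shell sketch checks a Managed Server's root directory. Treating a missing boot.properties as a note rather than a failure is a choice made for this example, and the function name check_msi_files is hypothetical.

```shell
# Sketch: return 0 if msi-config.xml and SerializedSystemIni.dat are both in
# the given root directory, 1 otherwise. A missing boot.properties is
# reported on standard error but does not change the return value.
check_msi_files() {
    root_dir=$1
    missing=0
    for f in msi-config.xml SerializedSystemIni.dat; do
        if [ ! -f "$root_dir/$f" ]; then
            echo "missing: $root_dir/$f" >&2
            missing=1
        fi
    done
    if [ ! -f "$root_dir/boot.properties" ]; then
        echo "note: no boot.properties in $root_dir" >&2
    fi
    return "$missing"
}
```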

If msi-config.xml and SerializedSystemIni.dat are not in the root directory, you can either:

MSI Mode and the Domain Log File

Each WebLogic Server instance writes log messages to its local log file and a domain-wide log file. The domain log file provides a central location from which to view messages from all servers in a domain.

Usually, a Managed Server forwards messages to the Administration Server, and the Administration Server writes the messages to the domain log file. However, when a Managed Server runs in MSI mode, it assumes the role of writing to the domain log file.

By default, the pathnames for local log files and domain log files are relative to the Managed Server's root directory. With these default settings, if a Managed Server is located in its own root directory (and it does not share its root directory with the Administration Server), when it runs in MSI mode the Managed Server will create its own domain log file in its root directory.

If a Managed Server shares its root directory with the Administration Server, or if you specified an absolute pathname to the domain log, the Managed Server in MSI mode will write to the domain log file that the Administration Server created.

Note: The Managed Server must have permission to write to the existing file. If you run the Administration Server and Managed Servers under different operating system accounts, you must modify the file permissions of the domain log file so that both user accounts have write permission.

Contacting the Security Realm

In addition to the configuration files, a server must also have access to a security realm to complete its startup process.

If you use the security realm that WebLogic Server installs, then the Administration Server maintains an LDAP server to store the domain's security data. All Managed Servers replicate this LDAP server. If the Administration Server fails, Managed Servers running in Managed Server Independence mode can use the replicated LDAP server for security services.

If you use a third-party security provider, then the Managed Server must be able to access the security data before it can complete its startup process.

Replicating the Domain's Configuration Files

Managed Server Independence mode includes an option that copies the required configuration files from the Administration Server's root directory into the Managed Server's root directory every 5 minutes. Depending on your backup schemes and the frequency with which you update your domain's configuration, this option might not be worth the performance cost of copying potentially large files across a network.

Before you can use this feature, you must initialize the required environment:

Caution: Do not enable file replication for a server that shares an installation or root directory with another server. Unpredictable errors can occur for both servers.

  1. Start the domain's Administration Server.

  2. Configure the Managed Server to replicate the domain's configuration files.

    See Replicating a Domain's Configuration Files in the Administration Console Online Help.

  3. Leave the Administration Server running for at least 5 minutes.

After the Managed Server contacts the Administration Server and copies the configuration files to its own root directory, the Managed Server can use its copy of these files when starting in Managed Server Independence mode.

This option does not replicate a boot identity file. (For more information about boot identity files, refer to Bypassing the Prompt for Username and Password in the WebLogic Server Administration Guide.)

How a Managed Server Restores Communication with an Administration Server

When the Administration Server starts, it can detect the presence of running Managed Servers (if -Dweblogic.management.discover=true, which is the default setting for this property). Upon startup, the Administration Server looks at a persisted copy of the file running-managed-servers.xml and notifies all the Managed Servers of its presence. If the Managed Server is running in Managed Server Independence mode, it deactivates this self-administering mode and registers itself to the Administration Server for future configuration change notifications.

For information about restarting the Administration Server in this scenario, see Restarting an Administration Server When Managed Servers are Running.

Disabling Managed Server Independence

By default, Managed Server Independence mode is enabled. For information about disabling the mode, refer to Disabling Managed Server Independence in the Administration Console online help.

 


Self-Health Monitoring and Restart for Managed Servers

WebLogic Server 7.0 provides a new self-health monitoring feature to improve the reliability and availability of servers in a domain. Selected subsystems within each WebLogic Server monitor their health status based on criteria specific to the subsystem. (For example, the JMS subsystem monitors the condition of the JMS thread pool while the core server subsystem monitors default and user-defined execute queue statistics.) If an individual subsystem determines that it can no longer operate in a consistent and reliable manner, it registers its health state as "failed" with the host server.

Each WebLogic Server, in turn, checks the health state of all its registered subsystems to determine the overall viability of the server. If the server finds that one or more critical subsystems have reached the FAILED state, the server marks its own health state as FAILED to indicate that it cannot reliably host an application.

When used in combination with the Node Manager application, server self-health monitoring enables you to automatically reboot servers that have failed. This improves the overall reliability of a domain, and requires no direct intervention from an Administrator. See Managing Server Availability with Node Manager for more information.

 
