2 Management and Monitoring

This chapter describes management and monitoring best practices for Oracle Application Server. It includes the following topics:

Section 2.1, "Oracle Enterprise Manager 10g Best Practices"
Section 2.2, "Oracle Process Manager and Notification Server Best Practices"
Section 2.3, "Distributed Configuration Management Best Practices"
Section 2.4, "Dynamic Monitoring Services Best Practices"

2.1 Oracle Enterprise Manager 10g Best Practices

This section describes best practices for Oracle Enterprise Manager 10g. It features the following topics:

Section 2.1.1, "Select the Framework Options That Best Suit Your Needs"
Section 2.1.2, "Application Server Control Console"
Section 2.1.3, "Grid Control Console"

2.1.1 Select the Framework Options That Best Suit Your Needs

Enterprise Manager can be deployed in several ways, which gives you the flexibility to select the configuration that best suits your needs. If you are working in a simple development or test environment, or if you have a single Oracle Application Server 10g instance to manage, you can use Oracle Enterprise Manager 10g Application Server Control Console (Application Server Control Console), which is available with any Oracle Application Server middle-tier installation. Application Server Control Console enables you to directly access all the pages for managing and monitoring the instance.

In a production environment, you typically manage a wider variety of software and hardware components. For example, you need to manage the databases and host computers that support your Web applications. For your production environment, you should use Oracle Enterprise Manager 10g Grid Control Console. The Grid Control Console provides you with a central location from which you can manage your Oracle Application Server instances, your databases, and your entire Oracle environment. Grid Control Console also supports sharing of information between administrators.

Implementation Details

See Also:

Oracle Enterprise Manager Concepts for further information about the Application Server Control Console and Grid Control Console

2.1.2 Application Server Control Console

This section contains the following topics:

Section 2.1.2.1, "Use the Deployment Wizard to Deploy Applications"
Section 2.1.2.2, "Use Clusters for Application Deployment and Configuration Management to Simplify Management of Application Servers"
Section 2.1.2.3, "Monitor Application Performance During Application Development or Test Cycles to Identify Resource Usage and Identify Bottlenecks"
Section 2.1.2.4, "Use the Host Home Page to Help Diagnose Performance Issues"
Section 2.1.2.5, "Perform Configuration Changes in Application Server Control to Ensure the Repository is Properly Updated"
Section 2.1.2.6, "Monitor Rate and Aggregated Performance Metrics to Identify Slow Requests"

2.1.2.1 Use the Deployment Wizard to Deploy Applications

A simple way to deploy an application is to use the Oracle Enterprise Manager 10g deployment wizard, which you can access from the Application Server Control Console. The wizard walks you systematically through all the essential deployment options to ensure that your application is deployed correctly.

Implementation Details

See Also:

"Deploying a New OC4J Application" in the Application Server Control Console Online Help

2.1.2.2 Use Clusters for Application Deployment and Configuration Management to Simplify Management of Application Servers

Using OracleAS Clusters simplifies management and maintenance of your application servers. Clustering enforces consistent configurations across all members of the cluster. If you want to make a configuration change in every instance, you only need to make the change once. The clustering mechanism ensures that the new configuration is propagated to all members.

Similarly, clustering also enforces consistency of deployed applications across all application server instances. If you wish to deploy a new application or update an existing deployment on every application server instance in the cluster, you only need to deploy or update the application once. The clustering mechanism ensures that the application is properly deployed to all members.

Implementation Details

See Also:

"About Managing Oracle Application Server Clusters" in the Application Server Control Console Online Help

2.1.2.3 Monitor Application Performance During Application Development or Test Cycles to Identify Resource Usage and Identify Bottlenecks

During application development and testing, you can use the Application Server Control Console to monitor the application's resource usage and identify bottlenecks. For example, during a performance or load test you can view memory and CPU use for the Oracle Application Server instance overall and for the application. You can also drill down to find sessions, modules, EJBs, and methods that may be bottlenecks in the application.

Implementation Details

See Also:

"Maintaining OC4J Applications" in the Application Server Control Online Help

2.1.2.4 Use the Host Home Page to Help Diagnose Performance Issues

The Application Server Control Console home page not only displays critical performance data and resource usage for the application server instance, it also includes a link to information for the host. For example, if your application server is performing poorly you can first drill down to the related Host home page to determine if the underlying problem is due to resource problems with the host and other processes, or to services running on the computer.

Implementation Details

See Also:

"Obtaining Information about the Host Computer" in the Application Server Control Console Online Help

2.1.2.5 Perform Configuration Changes in Application Server Control to Ensure the Repository is Properly Updated

When you edit the configuration of Oracle Application Server components including Oracle HTTP Server, OC4J, or OPMN, you should do so using Application Server Control Console. Enterprise Manager ensures that your configuration changes are updated in the repository. If you edit these configuration files manually, you must use the command dcmctl updateConfig to notify the DCM repository of the changes.

Implementation Details

See Also:

Appendix A, "dcmctl Commands," in the Distributed Configuration Management Administrator's Guide for further information about the dcmctl updateConfig command

2.1.2.6 Monitor Rate and Aggregated Performance Metrics to Identify Slow Requests

Enterprise Manager home pages and drill downs include rate and aggregated performance data that are not available through the command line or other tools. For example, you can use Enterprise Manager to view the average processing time for an HTTP request, allowing you to zero in on specific requests that may be slow.

Enterprise Manager also displays performance information, such as average processing time for a servlet for the most recent five minutes, in addition to averages since startup. This information enables you to more easily diagnose problems in real time.
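
The difference between an average since startup and an average over a recent interval can be shown with a little arithmetic. The following sketch assumes a hypothetical pair of cumulative counters (completed requests and total processing time) sampled five minutes apart. The class and function names are illustrative and are not part of any Enterprise Manager interface.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Cumulative counters for one servlet (hypothetical names)."""
    completed: int        # requests completed since startup
    total_time_ms: float  # total processing time since startup


def average_since_startup(now: Sample) -> float:
    """Aggregated average: all processing time divided by all requests."""
    return now.total_time_ms / now.completed if now.completed else 0.0


def average_over_interval(earlier: Sample, now: Sample) -> float:
    """Recent average: only the work done between the two samples."""
    requests = now.completed - earlier.completed
    elapsed = now.total_time_ms - earlier.total_time_ms
    return elapsed / requests if requests else 0.0


# A servlet that was fast for a long time and slowed down recently.
five_minutes_ago = Sample(completed=10_000, total_time_ms=500_000.0)
current = Sample(completed=10_100, total_time_ms=560_000.0)

print(average_since_startup(current))                    # roughly 55.4 ms
print(average_over_interval(five_minutes_ago, current))  # 600.0 ms
```

In this made-up data the average since startup still looks healthy, while the interval average exposes the recent slowdown, which is why the recent figure is the more useful one for diagnosing problems as they happen.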

2.1.3 Grid Control Console

This section contains the following topics:

Section 2.1.3.1, "Use Alerts and Notifications to Proactively Monitor System Availability"
Section 2.1.3.2, "Set Up Grid Control Console to Monitor for Availability and Performance Issues"
Section 2.1.3.3, "Add OracleAS Farms and OracleAS Clusters to Centrally Manage Application Server"
Section 2.1.3.4, "Use End-User Performance Monitoring to Monitor Response Times of Web Pages"
Section 2.1.3.5, "Proactively Monitor Web Application Transactions to Test Performance Monitoring"
Section 2.1.3.6, "Use Diagnostics to Pinpoint OC4J Performance Problems"
Section 2.1.3.7, "Use Job System to Schedule a Deployment"
Section 2.1.3.8, "Regularly Perform Backups to Prepare for Loss of Data"
Section 2.1.3.9, "Use Grid Control to Manage Both Oracle Application Server and the Oracle Database"
Section 2.1.3.10, "Manage Multiple Oracle Application Server Instances on a Single Host to Reduce Resource Usages"

2.1.3.1 Use Alerts and Notifications to Proactively Monitor System Availability

Grid Control Console enables you to monitor your systems for specific conditions, such as loss of service or poor performance. When such a condition exists, Enterprise Manager generates an alert, which is displayed automatically on the appropriate Enterprise Manager home pages. You can also be notified through email or pager. At a minimum, you should set up the Grid Control Console to alert you when your critical or production application servers are unavailable.

You can also configure the Grid Control Console to notify specific administrators when an event condition occurs. This feature simplifies cooperation between administrators who share responsibility for the same systems.

Implementation Details

See Also:

Chapter 12, "Configuring Notifications," in Oracle Enterprise Manager Advanced Configuration

2.1.3.2 Set Up Grid Control Console to Monitor for Availability and Performance Issues

Once you have set up Grid Control Console to monitor for availability and performance issues, you will be alerted when a problem is detected. If Enterprise Manager detects that an application server component is unavailable, you can use Application Server Control Console to check the status of the component and restart it if desired. If a performance issue is detected with a component or application, you can drill down to the component home page and view detailed performance and diagnostic information. You can also drill down from the Oracle Application Server Containers for J2EE (OC4J) home page to find applications, modules, and methods. Using these drill downs, you can diagnose and resolve performance issues.

Implementation Details

See Also:

"Viewing An Application Server at a Glance" and "Viewing the Performance of Your Application Server" in the Grid Control Console Online Help

2.1.3.3 Add OracleAS Farms and OracleAS Clusters to Centrally Manage Application Server

If you use Oracle Application Server Farms and OracleAS Clusters in your environment and you use Grid Control Console, you should add the farms and clusters to Grid Control for central management. When adding a farm, Oracle recommends using the same farm name that Application Server Control Console shows, so that the two consoles remain consistent as you navigate from one to the other to perform various tasks. After the farm and clusters are added, you can monitor their members and perform common administrative tasks, such as starting, stopping, and restarting members; scheduling jobs to automate commonly executed tasks against the farm or cluster; and creating blackouts to perform scheduled maintenance on the farm or cluster.

Implementation Details

See Also:

"Adding Oracle Application Server Farms" in the Grid Control Console Online Help

2.1.3.4 Use End-User Performance Monitoring to Monitor Response Times of Web Pages

To monitor the actual performance of your Web application as experienced by your end-users, use the End-User Performance Monitoring feature for Web applications. The End-User Performance Monitoring feature of Enterprise Manager enables you to view and analyze the actual request response times for all Web pages accessed by all your end-users. You can assess the impact of a performance problem on your end-user base, or view page performance data by visitor, domain, region, or Web server, or by a combination of these axes. Also, you can highlight the monitoring of the most critical pages of your Web application by setting up a Watch List.

The End-User Performance Monitoring option requires configuration of OracleAS Web Cache, Apache HTTP Server 2.0, or standalone Oracle HTTP Server 2.0 to instrument end-user performance data.

Implementation Details

See Also:

Section 6.8, "Configuring End-User Performance Monitoring," in Oracle Enterprise Manager Advanced Configuration

2.1.3.5 Proactively Monitor Web Application Transactions to Test Performance Monitoring

Enterprise Manager provides a proactive approach to monitoring Web applications through test performance monitoring. Synthetic business transactions, or service tests, are created using the transaction recorder, and are then replayed and monitored at specified intervals from key representative user communities called beacons. Using this feature, you can measure the response times of key business transactions from various geographical user communities. Use test performance monitoring to:

Isolate server-side problems from network delays
Profile how much time is spent connecting to the server
Measure the time to the first byte of the response
Measure the time spent serving HTML and non-HTML content

Alerts will notify you when transaction response time thresholds have been exceeded.

Implementation Details

See Also:

Section 6.4.4, "Service Tests and Beacons," in Oracle Enterprise Manager Advanced Configuration to configure and enable this option

2.1.3.6 Use Diagnostics to Pinpoint OC4J Performance Problems

Enterprise Manager provides comprehensive diagnostics that enable you to quickly pinpoint Oracle Application Server Containers for J2EE (OC4J) performance problems within the middle tier. To determine performance bottlenecks within your application, use the Web Application Request Performance page to identify the slowest requests by OC4J processing time. Each request is broken down by JSP, servlet, EJB, and JDBC processing times. By traversing the invocation paths of the processing call stack down to the SQL statement level, you can quickly identify the source of the bottlenecks causing application slowdowns. Use the application correlation feature to determine whether other system-level problems have contributed to performance bottlenecks.

In addition, you can trace all invocation paths of a Web application starting at the transaction level on an on-demand basis, and diagnose performance problems across all tiers: from the network, through the middle tier (including JSP, servlet, EJB, and JDBC times), down to the SQL statement level.

If the performance bottleneck is found to be SQL, then in context you should launch the SQL Tuning Advisor to schedule analysis and tuning of the SQL statement in question. After analyzing the SQL statements, the advisor provides advice on optimizing the execution plan, the rationale for the proposed optimization, the estimated performance benefit, and the command to implement the advice.

See Also:

Section 6.9.1, "Selecting OC4J Targets for Request Performance Diagnostics," in Oracle Enterprise Manager Advanced Configuration

2.1.3.7 Use Job System to Schedule a Deployment

In some cases, you may want to deploy an application during off-hours or at a certain scheduled time. You can use the Enterprise Manager job system to schedule a deployment to occur at a selected time. Create a script that contains the dcmctl deployApplication command, and then schedule the script with the Enterprise Manager job system.
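
As a rough illustration, the script handed to the job system could be generated along the following lines. This is a sketch only: it builds the command line without running it, the -f, -a, and -co flags are assumptions that should be confirmed against the dcmctl command reference, and the EAR path, application name, and OC4J instance name are hypothetical.

```python
import shlex


def deploy_command(ear_file: str, app_name: str, component: str) -> list[str]:
    """Build (but do not run) a dcmctl deployApplication command line.

    The -f, -a, and -co flags are assumptions; confirm them in the
    dcmctl command reference before relying on this.
    """
    return ["dcmctl", "deployApplication",
            "-f", ear_file, "-a", app_name, "-co", component]


# Hypothetical EAR path, application name, and OC4J instance name.
command = deploy_command("/stage/orders.ear", "orders", "home")
print(shlex.join(command))
```

The printed line is what a scheduled shell script would contain. Keeping the command construction in one function makes it easier to review the exact arguments before the job runs unattended.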

Implementation Details

See Also:

Appendix A, "dcmctl Commands," in the Distributed Configuration Management Administrator's Guide for information about the deployApplication command
Chapter 8, "Job System," in Oracle Enterprise Manager Concepts for more information on how to use the Enterprise Manager job system

2.1.3.8 Regularly Perform Backups to Prepare for Loss of Data

You should perform regular backups of your application server, and when you back up, you should consider your entire application server environment. For example, you should not back up your middle-tier installation on Monday and your Infrastructure on Tuesday; if you did, you would not be able to restore your environment to a consistent state. Instead, back up your entire Oracle Application Server environment at once. Then, if a loss occurs, you can restore your entire environment to a consistent state. Ideally, perform an instance backup of your Oracle Application Server environment after every administrative change or, if this is not possible, on a regular basis. This backup enables you to restore your environment to a consistent state as of the time of your most recent configuration and metadata backup. To avoid an inconsistent backup, do not make any configuration changes until the backup completes for all Oracle Application Server instances.

You can use Grid Control to perform immediate Application Server backups or backups on a scheduled basis. For instance, you can schedule a weekly full, cold backup for Sunday nights and an incremental, online backup for the other nights. You should also take backups after installation as well as after any major configuration change.

Implementation Details

See Also:

"Scheduling a Backup of Oracle Application Servers" and "About Backing Up and Recovering Oracle Application Servers" in the Grid Control Console Online Help

2.1.3.9 Use Grid Control to Manage Both Oracle Application Server and the Oracle Database

If you plan to manage both your Oracle Application Server instances and your Oracle database from the same management console, install the latest version of Grid Control. Doing so ensures that you have the most up-to-date functionality for managing both types of targets.

2.1.3.10 Manage Multiple Oracle Application Server Instances on a Single Host to Reduce Resource Usages

By default, each Oracle Application Server instance on a host has its own Application Server Control Console, which manages the components of that particular Oracle Application Server instance.

If you have installed multiple Oracle Application Server instances on a single host, you can optionally reduce the memory and CPU consumption by performing a postinstallation configuration procedure to configure a single Application Server Control Console to manage two Oracle Application Server instances installed on the same host.

Implementation Details

See Also:

Section A.8, "Managing Multiple Oracle Application Server Instances on a Single Host," in Oracle Application Server Administrator's Guide

2.2 Oracle Process Manager and Notification Server Best Practices

This section describes Oracle Process Manager and Notification Server (OPMN) best practices. It includes the following topics:

Section 2.2.1, "Start OPMN to Manage Components"
Section 2.2.2, "Never Start or Stop OPMN Managed Components Manually"
Section 2.2.3, "Review stdout and stderr to Determine Cause of Components Not Starting"
Section 2.2.4, "Increase Timeout For Components to Avoid Timed-Out Requests"
Section 2.2.5, "Set Retry to High Values For Components Running on an Overloaded System to Avoid Restart of Computer"
Section 2.2.6, "Leverage Additional Logging to Aid in Debugging"
Section 2.2.7, "Configure Log Rotation to Avoid Log File Size Issues"
Section 2.2.8, "Configure Additional Start Order Dependencies to Customize Startup"
Section 2.2.9, "Use Event Scripts to Record Important Events"
Section 2.2.10, "Use OPMN to Manage External Components"

2.2.1 Start OPMN to Manage Components

Start the OPMN server as soon as possible after turning on the host. OPMN must be running whenever OPMN-managed components are turned on or off. OPMN must be the last service turned off whenever you reboot or turn off your computer.

2.2.2 Never Start or Stop OPMN Managed Components Manually

Oracle Application Server components managed by OPMN should never be started or stopped manually. Do not use command line scripts or utilities from previous versions of Oracle Application Server for starting and stopping Oracle Application Server components.

Implementation Details

To implement this best practice, use Application Server Control Console or the opmnctl command-line utility to start or stop Oracle Application Server components.

2.2.3 Review stdout and stderr to Determine Cause of Components Not Starting

The process-specific console logs are the first and best resource for investigating problems related to starting and stopping components. OPMN creates a log file for each component and names it with a unique concatenation of the Oracle Application Server component name and a number. For example, the standard output log for OracleAS Web Cache may be WebCache~WebCacheAdmin~1.

Implementation Details

To implement this best practice, review the standard output (stdout) and standard error (stderr) of OPMN-managed processes, which are reported in the log files available in the ORACLE_HOME/opmn/logs directory.

The stdout and stderr log files are reused and appended to when a component is restarted so these files can contain output from multiple invocations of a component.
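
Because these console logs accumulate output from several invocations, the most recent lines are usually the relevant ones. The following is a minimal sketch for viewing them, assuming that the ORACLE_HOME environment variable points at the installation. The helper names are illustrative and are not part of any Oracle tool.

```python
import os
from collections import deque
from pathlib import Path


def tail(log_file: Path, lines: int = 20) -> list[str]:
    """Return the last `lines` lines of a console log file."""
    with log_file.open(errors="replace") as handle:
        return list(deque((line.rstrip("\n") for line in handle), maxlen=lines))


def newest_logs(log_dir: Path, count: int = 3) -> list[Path]:
    """Return the `count` most recently modified files in log_dir."""
    files = [path for path in log_dir.iterdir() if path.is_file()]
    return sorted(files, key=lambda path: path.stat().st_mtime, reverse=True)[:count]


# Only runs where ORACLE_HOME/opmn/logs actually exists.
log_dir = Path(os.environ.get("ORACLE_HOME", "")) / "opmn" / "logs"
if log_dir.is_dir():
    for log in newest_logs(log_dir):
        print(f"== {log.name}")
        print("\n".join(tail(log)))
```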

2.2.4 Increase Timeout For Components to Avoid Timed-Out Requests

The time it takes to execute an opmnctl command depends on the type of Oracle Application Server process and on the available computer hardware, so the time required may not be readily apparent. For example, the default start timeout for OC4J is approximately one hour. If an OC4J process does not start up after an opmnctl command, OPMN waits approximately an hour before timing out and aborting the request.

Increase the start element timeout attribute for the component that takes a long time to start. Similarly, increase the stop element timeout attribute in opmn.xml for the component that takes a long time to stop.

Implementation Details

Set the timeout in the opmn.xml file at a level that allows OPMN to wait for the process to come up.
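
The change can be pictured with a small sketch that edits a simplified opmn.xml fragment. The fragment, component name, and timeout values are illustrative rather than shipped defaults, and the XML namespace that a real opmn.xml file declares is omitted here for brevity.

```python
import xml.etree.ElementTree as ET

# Simplified opmn.xml fragment; the component name and values are illustrative,
# and the namespace declared by the real file is omitted.
FRAGMENT = """
<ias-component id="OC4J">
   <process-type id="home" module-id="OC4J">
      <start timeout="600" retry="2"/>
      <stop timeout="120"/>
      <process-set id="default_island" numprocs="1"/>
   </process-type>
</ias-component>
"""


def set_timeout(component: ET.Element, element: str, seconds: int) -> None:
    """Set the timeout attribute on every <start> or <stop> element found."""
    for node in component.iter(element):
        node.set("timeout", str(seconds))


component = ET.fromstring(FRAGMENT)
set_timeout(component, "start", 900)  # slow-starting component
set_timeout(component, "stop", 300)   # slow-stopping component
print(ET.tostring(component, encoding="unicode"))
```

The same two attributes can of course be edited by hand. The point of the sketch is only to show which elements carry the timeout for starting and for stopping.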

2.2.5 Set Retry to High Values For Components Running on an Overloaded System to Avoid Restart of Computer

Pings occur periodically between OPMN and the components that it manages to ensure that each component is responsive and capable of servicing requests. A ping failure results in a certain number of retry attempts, and multiple consecutive failures result in a restart of the component. On overloaded systems, it may be necessary to increase the number of retry attempts made before the component is restarted.

Implementation Details

To implement this best practice:

In the opmn.xml file, locate <start> and <restart> elements.
Set the retry attribute in the appropriate element to a value greater than what is needed for the component to be pinged successfully.

This attribute specifies the number of times to retry a ping attempt before a component is considered hung.
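
The following sketch shows one way to raise the retry attribute without lowering a value that is already high enough. The fragment and its values are illustrative, the namespace of a real opmn.xml file is omitted, and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified opmn.xml fragment with illustrative values; the namespace
# declared by the real file is omitted.
FRAGMENT = """
<process-type id="home" module-id="OC4J">
   <start timeout="600" retry="2"/>
   <restart timeout="720" retry="2"/>
</process-type>
"""


def ensure_min_retry(process_type: ET.Element, minimum: int) -> list[str]:
    """Raise retry on <start> and <restart> to at least `minimum`.

    Returns the names of the elements that were changed.
    """
    changed = []
    for name in ("start", "restart"):
        for node in process_type.findall(name):
            if int(node.get("retry", "0")) < minimum:
                node.set("retry", str(minimum))
                changed.append(name)
    return changed


process_type = ET.fromstring(FRAGMENT)
print(ensure_min_retry(process_type, 5))
print(ET.tostring(process_type, encoding="unicode"))
```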

2.2.6 Leverage Additional Logging to Aid in Debugging

OPMN provides different levels of logging. In a typical production mode, set the log level to the minimum level. When you are having a problem related to OPMN, prior to contacting Oracle Support, try leveraging additional logging to aid in debugging the problem.

Implementation Details

When you are having an OPMN-related problem, perform these steps prior to contacting technical support:

In the opmn.xml file, set the level attribute of the <log-file> element for both the <notification-server> and <process-manager> elements to 8 or 9.
Execute the $ORACLE_HOME/opmn/bin/opmnctl debug command and save the output to a file.
Save a copy of all logs in the $ORACLE_HOME/opmn/logs directory.

The file at this log level contains valuable information to assist in debugging.
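
Saving a copy of the logs can be scripted. The following sketch packs the log directory into a timestamped archive. It assumes that the ORACLE_HOME environment variable is set, the archive name and destination are arbitrary choices, and the output of the opmnctl debug command still has to be saved separately.

```python
import os
import tarfile
import time
from pathlib import Path


def archive_logs(log_dir: Path, dest_dir: Path) -> Path:
    """Pack everything under log_dir into a timestamped .tar.gz in dest_dir.

    dest_dir should be outside log_dir so the archive does not include itself.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"opmn-logs-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(log_dir, arcname="opmn-logs")
    return archive


# Only runs where ORACLE_HOME/opmn/logs actually exists.
log_dir = Path(os.environ.get("ORACLE_HOME", "")) / "opmn" / "logs"
if log_dir.is_dir():
    print(archive_logs(log_dir, Path.home()))
```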

2.2.7 Configure Log Rotation to Avoid Log File Size Issues

OPMN can rotate the notification server log file (ons.log) and the process manager log file (ipm.log) based on size, time, or both. When the log file reaches the configured size, or at the given hour of the day, the OPMN logging mechanism closes the file, renames it with a time stamp suffix, and then creates a new log file. By default, log files are configured to be rotated based on size (1500000 KB), but when necessary, explicitly set rotation attributes for your environment.

OPMN can also rotate the console log files of managed processes, for example, the $ORACLE_HOME/opmn/logs/HTTP_Server~1 file for Oracle HTTP Server. At process startup, before handing off an existing console log file to a managed process, OPMN checks the file size against a configured limit (the rotation-size attribute of the <log-file> element of the <process-manager> element). If the file size exceeds the limit, OPMN renames the existing file to include a time stamp and then creates a new file for the managed process. If the rotation-size attribute is not configured, OPMN is not able to rotate the process console log file.

A proper rotation plan ensures that OPMN and OPMN-managed processes start and run without log file size issues, such as the 2 GB file size limit on some operating systems.

Implementation Details

To enable log rotation, configure the rotation-size and rotation-hour attributes of the <log-file> element for both <notification-server> and <process-manager> elements.
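
As an illustration, the sketch below sets both attributes on a simplified opmn.xml fragment. The size and hour values are examples only, the assumption that rotation-hour takes an hour of the day from 0 to 23 should be checked against the OPMN documentation, and the namespace of a real opmn.xml file is omitted.

```python
import xml.etree.ElementTree as ET

# Simplified opmn.xml fragment; the namespace declared by the real file is omitted.
FRAGMENT = """
<opmn>
   <notification-server>
      <log-file path="$ORACLE_HOME/opmn/logs/ons.log" level="4"/>
   </notification-server>
   <process-manager>
      <log-file path="$ORACLE_HOME/opmn/logs/ipm.log" level="4"/>
   </process-manager>
</opmn>
"""


def configure_rotation(root: ET.Element, size: int, hour: int) -> None:
    """Set rotation-size and rotation-hour on both OPMN <log-file> elements."""
    if not 0 <= hour <= 23:  # assumed valid range: an hour of the day
        raise ValueError("rotation hour must be between 0 and 23")
    for parent in ("notification-server", "process-manager"):
        log_file = root.find(f"{parent}/log-file")
        if log_file is None:
            raise LookupError(f"no <log-file> under <{parent}>")
        log_file.set("rotation-size", str(size))
        log_file.set("rotation-hour", str(hour))


root = ET.fromstring(FRAGMENT)
configure_rotation(root, size=1500000, hour=2)  # example values only
print(ET.tostring(root, encoding="unicode"))
```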

2.2.8 Configure Additional Start Order Dependencies to Customize Startup

OPMN is configured at installation with default start order dependencies, which enable you to start all of the components in an instance in a specific order with a single command. However, if a specific component requires other components and services to be up and running before it starts, you can configure additional dependencies to suit your environment.

2.2.9 Use Event Scripts to Record Important Events

You can configure OPMN to execute your own custom event scripts whenever a particular component starts, stops, or fails. It is useful to use one or more of the following event types:

pre-start: OPMN runs the pre-start script after any configured dependency checks have been performed and passed, and before the Oracle Application Server component starts. For example, you can use the pre-start script for site-specific initialization of external components.
pre-stop: OPMN runs the pre-stop script before stopping a designated Oracle Application Server component. For example, you can use the pre-stop script for collecting Java Virtual Machine stack traces prior to stopping OC4J processes.
post-crash: OPMN runs the post-crash script after the Oracle Application Server component has terminated unexpectedly. For example, a user could learn of component crashes by supplying a script or program to be executed at post-crash events, which sends a notification to the administrator's pager.

Implementation Details

See Also:

Appendix A, "OPMN Troubleshooting," in the Oracle Process Manager and Notification Server Administrator's Guide for a sample pre-start script
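
An event script can be any executable. The following sketch of a post-crash script simply records the event in a log file. It assumes, hypothetically, that OPMN passes the event details as -name value argument pairs; the argument names in the usage example and the log file location are illustrative, so check the sample script referenced above for the actual interface.

```python
import os
import sys
import tempfile
from datetime import datetime


def parse_event_args(argv: list[str]) -> dict[str, str]:
    """Turn ['-name', 'value', ...] into {'name': 'value', ...}."""
    details = {}
    tokens = iter(argv)
    for token in tokens:
        if token.startswith("-"):
            details[token.lstrip("-")] = next(tokens, "")
    return details


def format_event(event: str, details: dict[str, str]) -> str:
    """Render one log line: time stamp, event type, then sorted name=value pairs."""
    fields = " ".join(f"{key}={value}" for key, value in sorted(details.items()))
    return f"{datetime.now().isoformat(timespec='seconds')} {event} {fields}"


if __name__ == "__main__":
    # Hypothetical log location; a real script might page an administrator instead.
    log_path = os.path.join(tempfile.gettempdir(), "opmn-events.log")
    with open(log_path, "a") as log:
        log.write(format_event("post-crash", parse_event_args(sys.argv[1:])) + "\n")
```

For example, invoking the script with the hypothetical arguments -componentId OC4J -pid 1234 would append a line ending in "post-crash componentId=OC4J pid=1234".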

2.2.10 Use OPMN to Manage External Components

OPMN can manage arbitrary daemon processes that are not part of your Oracle Application Server installation. You can create more sophisticated process management services by specifying, in the opmn.xml file, optional paths to scripts for stopping, restarting, and pinging the daemon process.

Implementation Details

Here is a simple example of an opmn.xml configuration for a custom component. The following lines load and identify the custom process module:

<module path="%ORACLE_HOME%/opmn/lib/libopmncustom.so">
   <module-id id="CUSTOM" />
</module>

The following lines represent the minimum configuration for a custom process:

<ias-component id="Custom">
   <process-type id="Custom" module-id="CUSTOM">
      <process-set id="Custom" numprocs="1">
         <module-data>
            <category id="start-parameters">
               <data id="start-executable" value="Your start executable here" />
            </category>
         </module-data>
      </process-set>
   </process-type>
</ias-component>

2.3 Distributed Configuration Management Best Practices

This section describes best practices for Distributed Configuration Management (DCM). It contains the following topics:

Section 2.3.1, "Use DCM Archiving to Take Snapshots of Configuration"
Section 2.3.2, "Specify a Single Instance in a Cluster as the Management Point to Provide A Correct Order of Operations"
Section 2.3.3, "Avoid Concurrent Administration Operations to Prevent Configuration Conflicts"
Section 2.3.4, "Avoid Running updateConfig Concurrently with Any Other Configuration Operation to Prevent Configuration Conflicts"
Section 2.3.5, "Restart Application Server Control Console after Joining or Leaving a Farm or Cluster to Refresh the Console"
Section 2.3.6, "Use High Availability Features for Infrastructure Repository to Synchronize within a Farm"
Section 2.3.7, "Follow dcmctl Tips to Improve Usage"

2.3.1 Use DCM Archiving to Take Snapshots of Configuration

You should create an archive before performing any configuration operation.

The DCM archive feature provides a convenient and easy means of managing snapshots of the DCM-managed portions of Oracle Application Server system configuration. Archives are useful for staging changes, recovering from errors, and provisioning the DCM-managed configuration information associated with one Oracle Application Server instance to another.

DCM managed system configuration includes configuration for a farm, clusters, Oracle HTTP Server, OPMN, OC4J, and JAZN. For OC4J, in addition to configuration information related to the container itself, DCM manages all deployed J2EE applications.

If you use DCM-Managed Oracle Application Server Clusters, DCM assures that any change to the configuration is automatically distributed to all members of the cluster. As an alternative to using clusters, you can manually apply an archive of a staged configuration to non-clustered instances in a farm.

A hybrid staging solution is to first stage and test changes to a non-clustered instance, archive the changes, and finally apply the archive to DCM-Managed Oracle Application Server Cluster. These changes are then automatically propagated to all members of the cluster.

Implementation Details

For example, to create an archive prior to deploying a new J2EE application named foo use the command:

dcmctl createArchive -arch PriorToDeployingFoo -comment "prior to foo deploy V1"

When using createArchive, it is a good practice to use an archive name and a corresponding comment that identifies the version of configuration that the archive is associated with.
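
One way to keep archive names and comments consistent is to generate them. The sketch below builds the createArchive command shown above from an application name and version. The naming scheme is only an illustration, and the command is printed rather than executed.

```python
import shlex
from datetime import date


def archive_command(app: str, version: str, when: date) -> list[str]:
    """Build a dcmctl createArchive command with a descriptive name and comment."""
    name = f"PriorToDeploying{app.capitalize()}{version}"
    comment = f"prior to {app} deploy {version} on {when.isoformat()}"
    return ["dcmctl", "createArchive", "-arch", name, "-comment", comment]


command = archive_command("foo", "V1", date(2005, 3, 14))
print(shlex.join(command))
```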

See Also:

Chapter 3, "Archiving A Managed Configuration," in the Distributed Configuration Management Administrator's Guide for further information about DCM archives

2.3.2 Specify a Single Instance in a Cluster as the Management Point to Provide A Correct Order of Operations

You can manage Oracle Application Server instances, grouped in a DCM-Managed Oracle Application Server Cluster, as a single point of administration, using Application Server Control Console or dcmctl on any instance in the cluster. Use one instance as the administrative point for the entire cluster at any point in time.

Specifying a single instance in a cluster as the management point ensures that operations are executed in the correct order and are properly serialized.

2.3.3 Avoid Concurrent Administration Operations to Prevent Configuration Conflicts

When changing instance-specific configuration (for example, port numbers, host names, or virtual hosts) on a particular instance in the DCM-Managed OracleAS Cluster, you must ensure that no other administrative changes are being made concurrently in the cluster. Conflicting changes can result in an unusable configuration.

Concurrent administration within a DCM-Managed OracleAS Cluster is strongly discouraged. Issuing multiple administrative operations at the same time in a cluster can lead to errors and confusing error messages. For example, it makes no sense to change the configuration of an instance while that instance is being deleted.

2.3.4 Avoid Running updateConfig Concurrently with Any Other Configuration Operation to Prevent Configuration Conflicts

Do not run the dcmctl updateConfig command concurrently with any other dcmctl commands or Application Server Control Console configuration operations from multiple Oracle Application Server instances in a Farm or DCM-Managed OracleAS Cluster. If updateConfig is executed concurrently with other configuration operations, there is a risk of conflicting changes being placed in the metadata repository. These conflicts could leave the configuration stored in the metadata repository in a non-functional state and could require a restore from an archive.

2.3.5 Restart Application Server Control Console after Joining or Leaving a Farm or Cluster to Refresh the Console

When using a file-based repository, you should stop and then start Application Server Control Console after issuing the following dcmctl commands:

joinCluster
joinFarm
leaveCluster
leaveFarm

Implementation Details

Use the following commands to restart Application Server Control Console:

emctl stop iasconsole
emctl start iasconsole
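
To make sure the restart is not forgotten, the membership change and the console restart can be combined in one step. The following sketch assumes that dcmctl and emctl are on the PATH. The function name and the DCMCTL and EMCTL variables are illustrative only and are not part of either tool.

```shell
# Command names can be overridden, for example with full paths.
DCMCTL="${DCMCTL:-dcmctl}"
EMCTL="${EMCTL:-emctl}"

# Run a dcmctl membership command (joinCluster, joinFarm, leaveCluster,
# or leaveFarm) and restart the console only if that command succeeds.
change_membership_and_restart() {
    "$DCMCTL" "$@" || return 1
    "$EMCTL" stop iasconsole
    "$EMCTL" start iasconsole
}

# Example (not executed here):
#   change_membership_and_restart joinCluster -cl testcluster
```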

2.3.6 Use High Availability Features for Infrastructure Repository to Synchronize within a Farm

The infrastructure repository houses all the configuration information for the Oracle Application Server instances in a farm. This information is critical during startup, since DCM ensures that the local configuration of any node is synchronized with the configuration in this central repository. Therefore, it is a good idea to employ the high availability features for the infrastructure instance.

However, it is also important to understand that the database-based repository (in the case of a J2EE and OracleAS Web Cache installation) is used for management operations and OracleAS Single Sign-On. Thus, if a site is not using single sign-on capabilities, the repository primarily needs to be up only when you perform configuration management operations, such as deploying new applications and joining or leaving a DCM-Managed OracleAS Cluster.

2.3.7 Follow dcmctl Tips to Improve Usage

The following are best practices when using dcmctl:

Always use -d and -v options with dcmctl commands.

By default, the dcmctl script is configured for programmatic usage. Instead of displaying lengthy messages that can differ across releases and languages, error codes are displayed, such as ADMN-906005. Scripting tools can use these error codes to perform different activities based upon the result of commands.

An error code such as ADMN-906005 conveys little by itself. To see an explanation of the error code, use the -d and -v switches whenever possible.
Use the dcmctl getError command to display the last error message.

The dcmctl getError command displays the error message from the most recent DCM error that occurred.
Always use dcmctl getReturnStatus to determine whether a command failed after a timeout.

Long-running operations often time out but continue to execute asynchronously. dcmctl indicates this condition with an ADMN-906005 error code.

For example, when you run the dcmctl deployApplication command with the -v option, the following message is displayed:

"The specified command "deployApplication", is being executed asynchronously. The maximum wait time of n seconds has been reached. This operation will continue to execute to completion. Use the "getReturnStatus" command to determine if/when the operation completes successfully."

After you receive this timeout message, invoke the dcmctl getReturnStatus command periodically until the operation has completed.
Use dcmctl shell mode for multiple commands.

When you need to run a number of dcmctl commands, use the dcmctl shell or a dcmctl command file. Each initialization of dcmctl requires creating a Java Virtual Machine and parsing a number of XML documents. In a dcmctl shell this initialization occurs only once, whereas it occurs once for every command when you run the commands individually.

Implementation Details

The following is a sample shell session in which the shell is started, commands are executed, and the shell is stopped.

% dcmctl shell
dcmctl> createcluster -cl testcluster
dcmctl> joincluster -cl testcluster
dcmctl> createcomponent -ct oc4j -co component1
dcmctl> deployapplication -f /stage/apps/app1.ear -a app1 -co component1
dcmctl> getstate
dcmctl> exit
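
The commands from the session above can also be placed in a command file and run in a single dcmctl process. The following sketch assumes that your release accepts a command file through dcmctl shell -f; confirm the option against the documentation for your release. The function names and the DCMCTL variable are hypothetical.

```shell
# Command name can be overridden, for example with a full path.
DCMCTL="${DCMCTL:-dcmctl}"

# Write the commands from the sample session to the file named by $1.
write_sample_commands() {
    cat > "$1" <<'EOF'
createcluster -cl testcluster
joincluster -cl testcluster
createcomponent -ct oc4j -co component1
deployapplication -f /stage/apps/app1.ear -a app1 -co component1
getstate
EOF
}

# Run every command in the file named by $1 inside one dcmctl process.
# ASSUMPTION: "shell -f <file>" is the command file option in your release.
run_command_file() {
    "$DCMCTL" shell -f "$1"
}

# Example (not executed here):
#   write_sample_commands /tmp/setup.dcm && run_command_file /tmp/setup.dcm
```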

See Also:

Appendix A, "dcmctl Commands," in the Distributed Configuration Management Administrator's Guide for a sample pre-start script
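
The periodic getReturnStatus check described in the tips above can be scripted. The following is a sketch under one important assumption that you should verify for your release: that dcmctl getReturnStatus exits with a nonzero status until the asynchronous operation has completed successfully. The function name, its arguments, and the DCMCTL variable are hypothetical.

```shell
# Command name can be overridden, for example with a full path.
DCMCTL="${DCMCTL:-dcmctl}"

# Poll getReturnStatus until it succeeds or the attempts run out.
# Arguments: maximum number of attempts, seconds to wait between attempts.
# ASSUMPTION: a nonzero exit status means the operation has not yet
# completed successfully.
wait_for_dcm_operation() {
    attempts=$1
    interval=$2
    while [ "$attempts" -gt 0 ]; do
        if "$DCMCTL" getReturnStatus -d -v; then
            return 0
        fi
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            sleep "$interval"
        fi
    done
    return 1
}

# Example (not executed here): check every 30 seconds, up to 20 times.
#   wait_for_dcm_operation 20 30
```

Because the loop cannot distinguish an operation that is still running from one that has failed, review the output of the last getReturnStatus call when the function returns a nonzero status.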

2.4 Dynamic Monitoring Services Best Practices

This section describes Dynamic Monitoring Services (DMS) best practices. It includes the following topics:

Section 2.4.1, "Monitor Your System Regularly to Identify Performance Problems"
Section 2.4.2, "Take Regular Dumps of Metrics to Capture and Save a Record of Performance Data"
Section 2.4.3, "Add Performance Instrumentation to Application to Aid Developers"
Section 2.4.4, "Isolate Expensive Intervals Using PhaseEvent Metrics to Validate Code"
Section 2.4.5, "Organize Performance Data to Avoid Metrics Not Displaying"
Section 2.4.6, "DMS Naming Conventions to Improve Metric Reports"
Section 2.4.7, "Follow DMS Coding Recommendations to Improve Code"
Section 2.4.8, "Validate New Metrics to Verify Accuracy"

2.4.1 Monitor Your System Regularly to Identify Performance Problems

It is a good practice to monitor Oracle Application Server regularly. Monitoring Oracle Application Server and obtaining performance data can assist you in tuning the system and debugging applications with performance problems.

Implementation Details

See Also:

Oracle Application Server Performance Guide for available monitoring tools

2.4.2 Take Regular Dumps of Metrics to Capture and Save a Record of Performance Data

Run the dmstool command with the -dump option periodically, such as every 15 to 20 minutes, to capture and save a record of performance data for your Oracle Application Server installation. Performance data saved over time can help you analyze system behavior to improve performance or to diagnose problems when they occur. The dmstool -dump command reports all available metrics on standard output. The -dump option also supports the format=xml query. Adding this query at the end of the command line produces the metric output in XML format.
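
A scheduled script is one way to take these dumps. The following sketch saves each dump to a time-stamped file. It assumes that dmstool is on the PATH; the function name, the DMSTOOL and DUMP_DIR variables, and the directory and script paths are hypothetical.

```shell
# Command name and output directory can be overridden.
DMSTOOL="${DMSTOOL:-dmstool}"
DUMP_DIR="${DUMP_DIR:-/var/tmp/dms_dumps}"   # hypothetical location

# Save one XML dump of all metrics to a time-stamped file and
# print the name of the file that was written.
dump_metrics() {
    mkdir -p "$DUMP_DIR" || return 1
    out="$DUMP_DIR/dms_$(date +%Y%m%d_%H%M%S).xml"
    "$DMSTOOL" -dump format=xml > "$out" || return 1
    echo "$out"
}

# Example crontab entry (hypothetical script path) that calls a script
# containing dump_metrics every 15 minutes:
#   0,15,30,45 * * * * /path/to/dump_metrics.sh
```

Saved dumps accumulate over time, so plan to archive or remove old files.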

2.4.3 Add Performance Instrumentation to Application to Aid Developers

Consider instrumenting applications with DMS metrics. Adding performance instrumentation to Java applications will help developers, system administrators and support analysts understand system performance and monitor system status. DMS instrumentation refers to the process of inserting DMS calls into application code. Using the DMS API is a simple and efficient way to enable your application to measure, collect, and save performance information.

To create DMS metrics, developers add calls that notify DMS when events occur, when important intervals begin and end, or when pre-computed values change their state. At runtime, DMS stores performance information, called DMS metrics, in memory and enables you to save or view the metrics.

Implementation Details

See Also:

Oracle Application Server Performance Guide for available monitoring tools

2.4.4 Isolate Expensive Intervals Using PhaseEvent Metrics to Validate Code

Carefully consider the requirements for new metrics when you add DMS instrumentation. It is important to add enough metrics to validate that your code is behaving as desired, but not so many that the useful statistics become buried in detail. As a guide, try to observe the following rules when you add DMS metrics:

Add metrics only to provide an overview of the time the system spends in your block of code or module. You do not need to collect performance data for every method call, or for every distinct phase of your code or module.
When your code calls external code that you do not control, and that you expect could take a significant amount of time, add a PhaseEvent Sensor to track the start and the completion of the external code.

2.4.5 Organize Performance Data to Avoid Metrics Not Displaying

The DMS metrics are organized in a tree, with leaf nodes being Sensor metrics and branching nodes being Nouns. Define DMS Nouns to organize Sensors and their associated metrics. It is good practice to use Noun types only for Nouns that directly contain Sensors. When a Noun that has a Noun type contains only other Nouns and does not directly contain Sensors, AggreSpy displays the Noun type as a metric table with no metrics.

Maintain a static hierarchy for Noun types. A static hierarchy for Noun types means that some Noun types will always be ancestors of other Noun types. If it can be avoided, ensure a Noun does not have the same Noun type as any of its ancestors.

2.4.6 DMS Naming Conventions to Improve Metric Reports

Follow the guidelines for defining DMS names so that users viewing DMS metric reports can easily understand metrics across applications and across Oracle Application Server components. When you apply the naming convention rules, try to be as clear as possible. If there is a conflict, you might need to make an exception.

In general, try to use only alphanumeric and underscore characters for naming and avoid using the forward slash (/) character.

Implementation Details

See Also:

Oracle Application Server Performance Guide for different naming conventions

2.4.7 Follow DMS Coding Recommendations to Improve Code

Use the following coding recommendations for working with DMS:

When you create a new Noun or Sensor (PhaseEvent, Event, or State), its full name must not conflict with names in use by Oracle built-in metrics, or by other applications.
Be sure all PhaseEvents are stopped. Put the PhaseEvent start() call in a try block and the corresponding stop() call in the finally block.
Avoid creating any DMS Sensor or Noun more than once. You should define Sensors and Nouns during static initialization, or in the case of a Servlet, in the init() method. Caching makes this less important as a best practice unless you are concerned about maximum performance.
Assign a Noun type for each Noun. Nouns with no Noun type specified are not shown in the Spy or AggreSpy display.
PhaseEvents should only be used to measure a section of code that is expensive under some set of conditions.
The DMS API calls are thread safe; they provide sufficient synchronization to prevent races and access bugs.
Avoid frequently creating and destroying Nouns and Sensors.

2.4.8 Validate New Metrics to Verify Accuracy

You should test and verify the accuracy of the metrics that you add to Java applications. Use dmstool and the other available DMS monitoring tools to verify and test new metrics. Consider the following when you validate new metrics:

Do expected metrics appear in the display?
Do unexpected metrics appear in the display?

Verify that you have only added the metrics that you planned to add.
Are the metric values you see within reasonable ranges?

For example, a pool size metric should never report a negative value.
Are metric values accurate?

This validation can be difficult to test. If an alternate means of measuring a particular metric is available, then use it to verify metric values. For example, you can verify an Event Sensor count metric by examining records that you write to a log file or to the console.
When integrating DMS instrumentation with an existing package or when implementing a new feature, consider insulating a previously working system.

For example, you could include an option to enable and disable new DMS metrics.
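
The first check in the list above, whether expected metrics appear, can be partly automated against a saved dmstool -dump output file. The following sketch treats the dump as plain text and makes no assumption about its layout. The function name is hypothetical, and the metric names that you pass are your own.

```shell
# Report which of the expected metric names are absent from a saved dump.
# Arguments: dump file, then one or more metric names.
# Returns 0 if every name is found and 1 if any name is missing.
missing_metrics() {
    dump_file=$1
    shift
    missing=0
    for name in "$@"; do
        # -F matches the name as a fixed string, not as a pattern.
        if ! grep -F -q -- "$name" "$dump_file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Example (not executed here), using hypothetical file and metric names:
#   missing_metrics /var/tmp/dms_dumps/latest.txt myPool.size myRequests.count
```

The check is a substring match, so a short name can match a longer metric name. Use complete metric names to reduce false matches.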