4 Oracle Fusion Middleware Administration and Operations

This chapter describes Oracle Fusion Middleware administration. It includes the following sections:

Section 4.1, "Enterprise Deployment"
Section 4.2, "Management"
Section 4.3, "Scalability"
Section 4.4, "High Availability"
Section 4.5, "Load Balancing"
Section 4.6, "Diagnostic Data"
Section 4.7, "Cloning"
Section 4.8, "Oracle RAC"
Section 4.9, "Backup and Recovery"

4.1 Enterprise Deployment

An enterprise deployment is an Oracle best practices blueprint based on proven Oracle high-availability technologies and recommendations for Oracle Fusion Middleware. The high-availability best practices include all Oracle products across the entire technology stack—Oracle Database, Oracle Fusion Middleware, Oracle Applications, Oracle Collaboration Suite, and Oracle Grid Control.

An Oracle Fusion Middleware enterprise deployment:

Considers various business service level agreements (SLA) to make high-availability best practices as widely applicable as possible
Takes advantage of database grid servers and storage grids with low-cost storage to provide highly resilient, lower cost infrastructure
Uses results from extensive performance impact studies for different configurations to ensure that the high-availability architecture is optimally configured to perform and scale to business needs
Enables control over the length of time to recover from an outage and the amount of acceptable data loss from a natural disaster
Evolves with each Oracle release and is completely independent of hardware and operating system

For more information on enterprise deployment see the following guides:

For more information on high-availability practices, visit:

http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm

4.2 Management

You manage your Oracle Fusion Middleware environment using one of the following management tools:

Fusion Middleware Control
Oracle WebLogic Server Administration Console
Oracle WebLogic Scripting Tool
Oracle Process Manager and Notification Server
Oracle Enterprise Manager Grid Control

4.2.1 Fusion Middleware Control

Oracle Enterprise Manager Fusion Middleware Control (Fusion Middleware Control) is a Web browser-based, graphical user interface that you can use to monitor and administer a farm or cluster.

A farm is a collection of components managed by Fusion Middleware Control. It can contain Oracle WebLogic Server domains, one Administration Server, one or more Managed Servers, clusters, and the Oracle Fusion Middleware components that are installed, configured, and running in the domain.

Fusion Middleware Control organizes a wide variety of performance data and administrative functions into distinct, Web-based home pages for the farm, cluster, domain, servers, components, and applications. The Fusion Middleware Control home pages make it easy to locate the most important monitoring data and the most commonly used administrative functions all from your Web browser.

Fusion Middleware Control provides direct access to Oracle WebLogic Server Administration Console. The Web pages in the Fusion Middleware Control interface contain links that take enable you to access the Administration Console. For example, on the Domain home page, if you go to the Summary area there is a link that takes you to the Administration Console in order to configure and manage the domain.

For more information, see "Getting Started Using Oracle Enterprise Manager Fusion Middleware Control" in the Oracle Fusion Middleware Administrator's Guide.

4.2.2 Oracle WebLogic Server Administration Console

Oracle WebLogic Server Administration Console is a Web browser-based, graphical user interface that you use to manage an Oracle WebLogic Server domain. It is accessible from any supported Web browser with network access to the Administration Server.

Use the Administration Console to:

Configure, start, and stop Oracle WebLogic Server instances
Configure Oracle WebLogic Server clusters
Configure Oracle WebLogic Server services, such as database connectivity (JDBC) and JMS messaging
Configure security parameters, including creating and managing users, groups, and roles
Configure and deploy Java EE applications
Monitor server and application performance
View server and domain log files
View application deployment descriptors
Edit selected run time application deployment descriptor elements

For more information, see "Getting Started Using Oracle WebLogic Server Administration Console" in the Oracle Fusion Middleware Administrator's Guide.

4.2.3 Oracle WebLogic Scripting Tool

The Oracle WebLogic Scripting Tool (WLST) is a command-line scripting environment that you can use to create, manage, and monitor Oracle WebLogic Server domains. It is based on the Java scripting interpreter, Jython. In addition to supporting standard Jython features such as local variables, conditional variables, and flow control statements, WLST provides a set of scripting functions (commands) that are specific to Oracle WebLogic Server. You can extend the WebLogic scripting language to suit your needs by following the Jython language syntax.

You can use any of the following techniques to invoke WLST commands:

Interactively, on the command line
In script mode, supplied in a file
Embedded in Java code

For more information, see "Getting Started Using Oracle WebLogic Server Scripting Tool (WLST)" in the Oracle Fusion Middleware Administrator's Guide.

For more information about the WebLogic scripting tool see the Oracle Fusion Middleware WebLogic Scripting Tool Command Reference .

4.2.4 Oracle Process Manager and Notification Server

Oracle Process Manager and Notification Server (OPMN) manages and monitors the following Oracle Fusion Middleware components, referred to as system components:

Oracle HTTP Server
Oracle Web Cache
Oracle Internet Directory
Oracle Virtual Directory
Oracle Forms Services
Oracle Reports
Oracle Business Intelligence Discoverer
Oracle Business Intelligence

For more information, see "Getting Started Using Oracle Process Manager and Notification Server" in the Oracle Fusion Middleware Administrator's Guide.

4.2.5 Oracle Enterprise Manager Grid Control

Oracle Enterprise Manager Grid Control is a Web browser-based, graphical user interface that you can use to monitor multiple Oracle Fusion Middleware Farms and Oracle WebLogic Server Domains. In fact, Grid Control provides deep management solutions for Oracle technologies including Oracle packaged applications, Oracle Database and Oracle VM. Grid Control also offers extensive support for non-Oracle technologies through more than two dozen heterogeneous management plug-ins and connectors including Microsoft MOM, IBM WebSphere, JBoss, EMC storage, F5 BIG IP, Check Point Firewall, and Remedy.

Beyond managing your entire data center from a single interface, Grid Control offers critical features that help you manage Oracle Fusion Middleware and Oracle WebLogic Server more effectively and efficiently. Such additional management capabilities include:

Analyze and report on trends based upon collected availability and performance data.
Receive alert notifications (via email, page, SNMP) for metrics which have crossed thresholds previously defined by you.
Automate common administrative operations (e.g. start/stop, WLST scripts).
Resolve problems faster through visibility into all Java activity - including in-flight transactions - and tracing transactions from Java to Database and vice-versa.
Detect, validate, and report authorized and unauthorized configuration changes in real time.
Ensure configuration consistency across development and production environments.

For more information about Oracle Enterprise Manager Grid Control refer to the Oracle Enterprise Manager Grid Control Concepts Guide available on OTN.

4.3 Scalability

Scalability is the ability of a system to provide throughput in proportion to, and limited only by, available hardware resources. A scalable system is one that can handle increasing numbers of requests without adversely affecting response time and throughput.

The growth of computational power within one operating environment is called vertical scaling. Horizontal scaling is leveraging multiple systems to work together on a common problem, in parallel.

Oracle Fusion Middleware scales both vertically and horizontally. Horizontally, Oracle Fusion Middleware can increase its throughput with several Managed Servers grouped together to share a workload. Also, Oracle Fusion Middleware provides great vertical scalability, allowing you to add more Managed Servers or components in the same node.

High availability refers to the ability of users to access a system. Deploying a high-availability system minimizes the time when the system is down, or unavailable and maximizes the time when it is running, or available. Oracle Fusion Middleware is designed to provide a wide variety of high-availability solutions, ranging from load balancing and basic clustering to providing maximum system availability during catastrophic hardware and software failures.

High-availability solutions can be divided into two basic categories: local high availability and disaster recover.

For more information about high availability see the Oracle Fusion Middleware High Availability Guide.

4.4 High Availability

A high availability architecture is one of the key requirements for any Enterprise Deployment. Oracle Fusion Middleware has an extensive set of high availability features, which protect its components and applications from unplanned down time and minimize planned downtime to achieve your business goals.

High availability refers to the ability of users to access a system without loss of service. Deploying a high availability system minimizes the time when the system is down, or unavailable and maximizes the time when it is running, or available. This section provides an overview of high availability from a problem-solution perspective. This section includes the following topics:

High Availability Problems
High Availability Solutions
Disaster Recovery

4.4.1 High Availability Problems

Mission critical computer systems need to be available 24 hours a day, 7 days a week, and 365 days a year. However, part or all of the system may be down during planned or unplanned downtime. A system's availability is measured by the percentage of time that it is providing service in the total time since it is deployed. Table 4-1 provides an example.

Table 4-1 Availability Percentages and Corresponding Downtime Values

Availability Percentage	Approximate Downtime Per Year
95%	18 days
99%	4 days
99.9%	9 hours
99.99%	1 hour
99.999%	5 minutes

System downtime may be categorized as planned or unplanned. Unplanned downtime is any sort of unexpected failure. Planned downtime refers to scheduled operations that are known in advance and that render the system unavailable. The effect of planned downtime on end users is typically minimized by scheduling operational windows when system traffic is slow. Unplanned downtime may have a larger effect because it can happen at peak hours, causing a greater impact on system users.

These two types of downtimes (planned and unplanned) are usually considered separately when designing a system's availability requirements. A system's needs may be very restrictive regarding its unplanned downtimes, but very flexible for planned downtimes. This is the typical case for applications with high peak loads during working hours, but that remain practically inactive at night and during weekends. You may choose different high availability features depending on the type of failure is being addressed.

4.4.2 High Availability Solutions

High availability solutions can be categorized into local high availability solutions that provide high availability in a single data center deployment, and disaster recovery solutions, which are usually geographically distributed deployments that protect your applications from disasters such as floods or regional network outages.

Amongst possible types of failures, process, node, and media failures as well as human errors can be protected by local high availability solutions. Local physical disasters that affect an entire data center can be protected by geographically distributed disaster recovery solutions.

To solve the high availability problem, a number of technologies and best practices are needed. The most important mechanism is redundancy. High availability comes from redundant systems and components. You can categorize local high availability solutions by their level of redundancy, into active-active solutions and active-passive solutions (see Figure 4-1):

Active-active solutions deploy two or more active system instances and can be used to improve scalability as well as provide high availability. In active-active deployments, all instances handle requests concurrently.
Active-passive solutions deploy an active instance that handles requests and a passive instance that is on standby. In addition, a heartbeat mechanism is set up between these two instances. This mechanism is provided and managed through operating system vendor-specific clusterware. Generally, vendor-specific cluster agents are also available to automatically monitor and failover between cluster nodes, so that when the active instance fails, an agent shuts down the active instance completely, brings up the passive instance, and application services can successfully resume processing. As a result, the active-passive roles are now switched. The same procedure can be done manually for planned or unplanned downtime. Active-passive solutions are also generally referred to as cold failover clusters.

You can use Oracle Cluster Ready Services (CRS) to manage the Fusion Middleware Active-Passive (CFC) solutions.

Figure 4-1 Active-Active and Active-Passive High Availability Solutions

Description of "Figure 4-1 Active-Active and Active-Passive High Availability Solutions"

In addition to architectural redundancies, the following local high availability technologies are also necessary in a comprehensive high availability system:

Process death detection and automatic restart

Processes may die unexpectedly due to configuration or software problems. A proper process monitoring and restart system should monitor all system processes constantly and restart them should problems appear.

A system process should also maintain the number of restarts within a specified time interval. This is also important since continually restarting within short time periods may lead to additional faults or failures. Therefore a maximum number of restarts or retries within a specified time interval should also be designed as well.
Clustering

Clustering components of a system together allows the components to be viewed functionally as a single entity from the perspective of a client for run time processing and manageability. A cluster is a set of processes running on single or multiple computers that share the same workload. There is a close correlation between clustering and redundancy. A cluster provides redundancy for a system.

If failover occurs during a transaction in a clustered environment, the session data is retained as long as there is at least one surviving instance available in the cluster.
State replication and routing

For stateful applications, client state can be replicated to enable stateful failover of requests in the event that processes servicing these requests fail.
Failover

With a load-balancing mechanism in place, the instances are redundant. If any of the instances fail, requests to the failed instance can be sent to the surviving instances.
Server load balancing

When multiple instances of identical server components are available, client requests to these components can be load balanced to ensure that the instances have roughly the same workload.
Server Migration

Some services can only have one instance running at any given point of time. If the active instance becomes unavailable, the service is automatically started on a different cluster member. Alternatively, the whole server process can be automatically started on a different machine in the cluster.
Integrated High Availability

Components depend on other components to provide services. The component should be able to recover from dependent component failures without any service interruption.
Rolling Patching

Patching product binaries often requires down time. Patching a running cluster in a rolling fashion can avoid downtime. Patches can be uninstalled in a rolling fashion as well.
Configuration management

A clustered group of similar components often need to share common configuration. Proper configuration management ensures that components provide the same reply to the same incoming request, allows these components to synchronize their configurations, and provides high availability configuration management for less administration downtime.
Backup and Recovery

User errors may cause a system to malfunction. In certain circumstances, a component or system failure may not be repairable. A backup and recovery facility should be available to back up the system at certain intervals and restore a backup when a failure occurs.

4.4.3 Disaster Recovery

Disaster recovery solutions typically set up two homogeneous sites, one active and one passive. Each site is a self-contained system. The active site is generally called the production site, and the passive site is called the standby site. During normal operation, the production site services requests; in the event of a site failover or switchover, the standby site takes over the production role and all requests are routed to that site. To maintain the standby site for failover, not only must the standby site contain homogeneous installations and applications, data and configurations must also be synchronized constantly from the production site to the standby site.

Figure 4-2 Geographically Distributed Disaster Recovery

Description of "Figure 4-2 Geographically Distributed Disaster Recovery"

4.4.3.1 Components Protected by High Availability Solutions

The Oracle Fusion Middleware High Availability Guide discusses high availability solutions for the following components:

Oracle WebLogic Server
Oracle SOA Suite
Oracle ADF
Oracle WebCenter Portal and Oracle WebCenter Content
Oracle Identity Management Components
Oracle HTTP Server
Oracle Web Cache
Oracle Portal, Forms, Reports, and Discoverer

4.5 Load Balancing

Origin server load balancing is a feature in which HTTP requests are distributed among origin servers so that no single origin server is overloaded.

Oracle Web Cache supports load balancing and failover detection for application Web servers.

Oracle Web Cache ensures that cache misses are directed to the most available, highest-performing Web server in the server farm. A capacity heuristic guarantees performance and provides surge protection when the application Web server load increases.

For more information about load balancing and failover, see Oracle Fusion Middleware Administrator's Guide for Oracle Web Cache.

4.6 Diagnostic Data

Oracle Fusion Middleware components generate log files containing messages that record all types of events, including startup and shutdown information, errors, warning messages, and access information on HTTP requests. This chapter describes how to find information about the cause of an error and its corrective action, to view and manage log files to assist in monitoring system activity and to diagnose problems.

For more information see the Oracle Fusion Middleware Administrator's Guide.

4.7 Cloning

Cloning is the process of copying an existing entity to a different location while preserving its state. Some situations in which cloning Oracle Fusion Middleware is useful are:

Creating a Middleware home or Oracle home that is a copy of a production, test, or development environment. Cloning enables you to create a new Middleware home or an Oracle home with all patches applied to it in a single step. This is in contrast to separately installing, configuring and applying any patches to separate Oracle homes.
Preparing a "gold" image of a patched home and deploying it to many hosts.

The cloned entity behaves the same as the source entity. For example, a cloned Oracle home can be deinstalled or patched using the installer. It can also be used as the source for another cloning operation.

For more information about cloning see Oracle Fusion Middleware Administrator's Guide.

4.8 Oracle RAC

Oracle Real Application Clusters (Oracle RAC) is a computing environment that harnesses the processing power of multiple, interconnected computers. Along with a collection of hardware, called a cluster, it unites the processing power of each component to become a single, robust computing environment. A cluster comprises two or more computers, also called nodes. Oracle Real Application Clusters simultaneously provides a highly scalable and highly available database for Oracle Fusion Middleware.

Every Oracle RAC instance in the cluster has equal access and authority, therefore, node and instance failure may affect performance, but does not result in downtime, since the database service is available or can be made available on surviving server instances.

For more information on Oracle Real Application Clusters, see the Oracle Fusion Middleware High Availability Guide.

4.9 Backup and Recovery

An Oracle Fusion Middleware environment can consist of different components and configurations. A typical Oracle Fusion Middleware environment contains an Oracle WebLogic Server domain with Java component such as Oracle SOA Suite and an Oracle WebLogic Server domain with Oracle Identity Management components. It can also include one or more Oracle instances.

The installations of an Oracle Fusion Middleware environment are interdependent in that they contain configuration information, applications, and data that are kept in synchronization. For example, when you perform a configuration change, you might update configuration files in the installation. When you deploy an application, you might deploy it to all Managed Servers in a domain or cluster.

It is important to consider your entire Oracle Fusion Middleware environment when performing backup and recovery. You should back up your entire Oracle Fusion Middleware environment once, then periodically thereafter perform incremental backup and recovery operations. If a loss occurs, you can restore your environment to a consistent state.

For information about backup and recovery, see "Advanced Administration: Backup and Recovery" in the Oracle Fusion Middleware Administrator's Guide.