This chapter describes the design of the Portal Service on Application Server Cluster reference configuration, based on the functional and quality-of-service requirements that are specified in Business and Technical Requirements.
Read this chapter to understand the design rationale of the reference configuration before attempting to implement the deployment architecture in your own hardware environment.
The design of the reference configuration consists of a two-step process, first developing the logical architecture and then developing the deployment architecture, as described in the following sections:
This reference configuration architecture uses a web container provided by Sun Java™ System Application Server. While the architecture would not change substantially if Sun Java System Web Server were used to provide the web container, the implementation procedures would be substantially different.
A logical architecture shows the software components (and the interactions between them) that are needed to provide a specific set of services to end users.
An analysis of the reference configuration's functional requirements and quality-of-service requirements (which specify the required performance, availability, scalability, security, and serviceability) is the basis for determining the main Java ES software components that are needed to meet these requirements. In most cases, these components interact with or are dependent upon other, secondary software components. For information about Java ES components, the services they provide, and interdependencies between those components, see the Sun Java Enterprise System 5 Update 1 Technical Overview.
The following sections describe the Java ES components that are used in the portal service reference configuration, their roles within the reference configuration, and the interactions between them:
The various components that are needed to meet the reference configuration requirements depend on their functions as distributed infrastructure services or their roles within a tiered application framework. In other words, the various components represent two views or dimensions that define a logical architecture: the logical tier dimension and the distributed infrastructure services dimension. These dimensions are described in the Sun Java Enterprise System 5 Update 1 Technical Overview.
The positioning of reference configuration components in such a two-dimensional framework is shown in the following logical architecture diagram. Components are placed within a horizontal dimension that represents standard logical tiers and within a vertical dimension that represents infrastructure service dependency levels. The positioning of a component in this matrix helps describe the role that the component plays in the logical architecture.
For example, Access Manager is a component that is used by presentation and business service tier components to provide security and policy infrastructure services. However, Application Server is a component that is used by presentation and business service tier components to provide distributed runtime services.
A description of the tiers shown in Figure 2–1 is provided in the following table.
Table 2–1 Logical Tiers in the Architecture Diagram
While Figure 2–1 is indicative of the role of the different components within the reference configuration's logical architecture, the following table describes more precisely the purpose of each component.
Table 2–2 Software Components in the Logical Architecture
To design a logical architecture, you must understand the software dependencies and interactions between the various components that are listed in Table 2–2. These interactions can be somewhat complicated and difficult to illustrate in a single diagram such as Figure 2–1. The main interactions between components in the reference configuration are therefore described briefly in the table below, in the context of typical portal service operations.
Two access scenarios are incorporated into the following table:
Direct access of portal services from a trusted network
Indirect access of portal services from an unsecured network (such as the public Internet) by way of SRA Gateway
Table 2–3 Component Interactions in Typical Portal Service Operations

Step | What Happens
---|---
1 | The user starts a browser and opens the portal service or SRA Gateway service URL, depending on the access scenario being used.
2 | If portal services are accessed directly, Portal Server returns the anonymous desktop, which includes the login channel. If portal services are accessed through the SRA Gateway, the Gateway redirects the user request to Access Manager. Access Manager returns the login page (by way of the Gateway).
3 | The user logs in by typing a user ID and password in the appropriate form and clicking Login.
4 | Access Manager interacts with Directory Server to retrieve the user's profile, which contains authentication, authorization, and application-specific information.
5 | Access Manager authenticates the user's ID and password against the LDAP directory information and creates a session object.
6 | When the user has been authenticated, Access Manager returns a session cookie to the user's browser and redirects the browser to Portal Server. Portal Server uses the session cookie to interact with Access Manager to access information in the user's profile (cached by Access Manager). Portal Server uses the information to build the user's personalized portal desktop. Portal Server returns the desktop to the user's browser (by way of the Gateway).
7 | The user reviews his or her portal desktop, and clicks a portal channel.
8 | Portal Server interacts with Access Manager to validate the status of the user session. Access Manager authorizes the channel content that is being requested by the user.
9 | When appropriate, Portal Server creates a portlet session and returns channel content to the user's browser.
10 | The user logs out or the session times out.
11 | Portal Server closes the portlet session, if any, and Access Manager deletes the user's session object.
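The authentication steps above can be sketched as a small simulation. All class and method names here are illustrative stand-ins, not actual Access Manager or Portal Server APIs; the point is only the sequence of credential check, session-object creation, cookie-keyed profile lookup, and logout.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Illustrative simulation of steps 3 through 11: authentication,
// session creation, cookie-based profile lookup, and logout.
// All names are hypothetical; real Access Manager APIs differ.
public class LoginFlowSketch {
    private final Map<String, String> directory = new HashMap<>(); // userId -> password
    private final Map<String, String> sessions = new HashMap<>();  // cookie -> userId

    public LoginFlowSketch() {
        directory.put("jdoe", "secret"); // stand-in for the LDAP user entry
    }

    // Steps 4-6: validate credentials against the directory and
    // return a session cookie, or null if authentication fails.
    public String login(String userId, String password) {
        if (!password.equals(directory.get(userId))) {
            return null; // authentication failed; no session object created
        }
        String cookie = UUID.randomUUID().toString();
        sessions.put(cookie, userId); // session object keyed by cookie
        return cookie;
    }

    // Portal Server uses the cookie to retrieve the user's profile.
    public String profileFor(String cookie) {
        return sessions.get(cookie);
    }

    // Steps 10-11: logout deletes the session object.
    public void logout(String cookie) {
        sessions.remove(cookie);
    }
}
```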
The understanding of component interactions represented in the logical architecture can be used later in the design process when you estimate the load on different components for sizing purposes and when you create a network connectivity specification.
A deployment architecture is a mapping of software components to a hardware environment. More specifically, for the Portal Service on Application Server Cluster reference configuration, it represents how to map the components in the logical architecture to networked computers in a way that achieves the specified quality-of-service requirements.
The logical architecture in Figure 2–1 identifies the components that are needed to meet the functional requirements of the reference configuration. The deployment architecture, however, shows how to do so with the specified quality of service.
The following sections present the deployment architecture diagram and discuss how the deployment architecture addresses the quality-of-service requirements:
The quality-of-service requirements for the portal service on Application Server Cluster reference configuration are summarized in the following table.
Table 2–4 Quality-of-Service Requirements for the Reference Configuration
Service Quality | Requirements
---|---
Performance | Response time under two seconds for default portal channels at peak load levels.
Availability | Service availability with both user session availability and application session state availability. No single points of failure.
Security | Protected services in separate network subnets. Firewall protection for Internet access and for the portal service subnet zone. Encrypted Internet transport over SSL.
Scalability | Easily scalable so that no computer system is more than 80% utilized under daily peak load. The deployed system should also accommodate long-term growth of 10% per year.
Serviceability | Minimize planned downtime needed to scale the portal service or to upgrade software components in the configuration.
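The scalability requirement above lends itself to simple sizing arithmetic. The following sketch computes the deployed capacity needed so that peak load stays at or below 80% utilization after several years of 10% annual growth; the peak-load figure used is an assumed example, not a measured value from the reference configuration.

```java
// Illustrative capacity arithmetic for the 80% utilization ceiling
// and 10% annual growth requirements. The peak-load input is an
// assumed example value.
public class CapacityHeadroom {
    // Minimum deployed capacity so that peakLoad stays at or below
    // 80% utilization after `years` of 10% annual growth.
    public static double requiredCapacity(double peakLoad, int years) {
        double grownPeak = peakLoad * Math.pow(1.10, years);
        return grownPeak / 0.80;
    }

    public static void main(String[] args) {
        // e.g. 1000 requests/s today, sized for 3 years of growth
        System.out.println(requiredCapacity(1000, 3)); // about 1664 requests/s
    }
}
```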
Figure 2–2 is a graphical representation of the deployment architecture for the Portal Service on Application Server Cluster reference configuration. It shows the following features of the deployment architecture:
The computers that are used to support the reference configuration and the components that are installed on each computer
The redundancy strategies that are used to achieve scalability and availability
The grouping of computers, components, and load balancers into service modules
The reference configuration deployment architecture is based on a Service Delivery Network Architecture (SDNA) approach, in which individual services within a solution are modularized (see https://wikis.sun.com/display/BluePrints/The+Service+Delivery+Network-+A+Case+Study). The result is a deployment architecture consisting of four independent service modules: SRA Gateway, portal, Access Manager, and directory.
In accordance with SDNA principles, each service module in the reference configuration independently implements its own level of availability, security, scalability, and serviceability. The overall solution can therefore be easily deployed, secured, maintained, and upgraded. An explanation of how the reference configuration's modular architecture facilitates quality-of-service objectives is provided in subsequent sections of this chapter.
The service modules that make up the reference configuration, shown in Figure 2–2, have the following common SDNA characteristics:
Each module consists of two or more instances of the service configured to meet quality-of-service requirements. Each module includes the components that are needed to provide a single service to the overall reference configuration.
For example, the two Directory Server instances in Figure 2–2 can be considered a unit that provides directory services for the other components in the deployment.
Each service module is accessed through a load balancer. The load balancer is configured to establish a virtual IP address, or virtual service address, for the module. Other components in the reference configuration are configured to send requests to the virtual service address rather than to the individual component instances.
For example, when Access Manager needs information from the directory service, it addresses its request to the virtual directory service address that is provided by the load balancer, rather than to a specific instance of Directory Server.
Each load balancer is responsible for routing incoming requests to a specific service instance and, where desired, for routing successive requests from the same user to the same service instance.
The component instances in a module can be reconfigured without changing the virtual service address of the module, making changes within the module transparent to other components. Depending on the kinds of usage patterns you experience, you can independently assign system resources and scale each module to meet load requirements, using scaling techniques that are best suited to the module.
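The load-balancer behavior described above, a virtual service address in front of interchangeable instances with optional session affinity, can be modeled with a short sketch. Real hardware load balancers are configured rather than coded; this toy router only illustrates the routing rules.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal model of the load-balancer behavior described above:
// round-robin distribution across instances, with sticky routing so
// successive requests from the same user reach the same instance.
public class StickyRoundRobin {
    private final List<String> instances = new ArrayList<>();
    private final Map<String, String> affinity = new HashMap<>(); // user -> instance
    private int next = 0;

    public StickyRoundRobin(List<String> initialInstances) {
        instances.addAll(initialInstances);
    }

    // Route a request: reuse the user's pinned instance if it is
    // still in service, otherwise pick the next instance round-robin.
    public String route(String userId) {
        String pinned = affinity.get(userId);
        if (pinned != null && instances.contains(pinned)) {
            return pinned;
        }
        String chosen = instances.get(next % instances.size());
        next++;
        affinity.put(userId, chosen);
        return chosen;
    }

    // Service failover: a failed instance is removed; its users are
    // re-pinned to a surviving instance on their next request.
    public void markDown(String instance) {
        instances.remove(instance);
    }
}
```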
While the modular architecture depicted in Figure 2–2 has many advantages, as described in subsequent sections of this chapter, alternative approaches in common practice do exist. The drawbacks of two such alternatives, which are not supported by this reference configuration, are discussed below.
In some situations, the modular architecture of Figure 2–2 might result in lower resource utilization than could be achieved by combining components on the same computer and running them in the same web container. In fact, many deployment architects have traditionally deployed Portal Server and Access Manager in the same web container in an effort to maximize resource utilization and reduce network traffic in updating Access Manager session information. However, such designs cannot realize the availability, security, scalability, and serviceability benefits of SDNA modularity, which generally outweigh the drawbacks.
Access Manager supports, by way of post-installation configuration, multiple LDAP directories for each Access Manager service. In this way, Access Manager can detect the failure of a primary Directory Server instance and fail over to a standby instance. This built-in mechanism has several drawbacks:
Configuration of these multiple Directory Server instances needs to be done for each of the Access Manager services: user profiles, policies, LDAP authentication, Membership authentication, and so forth.
Access Manager does not load balance directory requests: only the primary Directory Server instance is used, while the others remain inactive.
Upon failure of the primary instance, Access Manager switches over to the standby instance, but when the primary instance comes back online, there is no mechanism to revert to the original configuration.
By contrast, the modular architecture of Figure 2–2 has the following advantages:
The only required Access Manager configuration is the load balancer's virtual service address, specified at installation time.
The directory service load balancer in the reference configuration routes requests to all Directory Server instances, monitors the health of these instances, and automatically handles the failover and restoration of a failed instance.
The modular architecture allows you to configure, manage, scale, and monitor the Directory Server instances independently of the Access Manager instances.
In the multimaster replication approach of Figure 2–2, write operations are synchronized between directory instances. In environments with many write operations, the overhead of the multimaster replication process can slow down Directory Server processing of client requests. In these situations, the best approach is to direct all write operations to a single master by placing a Directory Proxy Server instance in front of each Directory Server instance. Such situations are not common in portal service deployments, so the reference configuration does not include Directory Proxy Server.
The deployment architecture that is represented in Figure 2–2 uses several strategies to meet the availability requirements of the reference configuration. Availability requirements fall into the two categories that are discussed in the following sections:
Service availability means that a service is available, even when a service provider fails. Service availability is generally achieved using multiple identically configured service instances (redundancy). Redundancy eliminates single points of failure (assuming that simultaneous failure of all instances is extremely unlikely). If one instance providing a service fails, another instance is available to take over. This mechanism is known as service failover.
Service failover is supported in the reference configuration through two mechanisms:
Load balancing. Load balancing uses redundant hardware and software components to distribute requests for a service among multiple component instances that provide the service. This redundancy provides greater capacity than would be possible with a single instance. It also means that if any one instance of a component fails, the other instances are available to assume a heavier load. Depending on the latent capacity that is built into the deployment, a failure might not result in significant degradation of performance. Load balancing is used in all of the service modules in the reference configuration.
Directory Server multimaster replication. This mechanism is the preferred availability solution for Directory Server, which provides data that is crucial to the operation of the entire deployment. Multimaster replication is specifically designed to synchronize data between the two (or more) Directory Server instances shown in the deployment architecture. It is the simplest directory service failover implementation and is suitable for all but the highest-end deployments that need to support millions of users.
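The idea behind multimaster synchronization can be conveyed with a toy model in which each master accepts writes and replays them to its peer, resolving conflicts by timestamp. Directory Server's actual replication protocol (change logs, replica update vectors) is far more sophisticated; nothing here reflects its real implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of multimaster write synchronization: each master
// accepts writes and replays them to a peer, with last-writer-wins
// conflict resolution by timestamp. This only conveys the idea; it is
// not how Directory Server replication is implemented.
public class MultimasterSketch {
    static class Entry {
        final String value;
        final long ts;
        Entry(String value, long ts) { this.value = value; this.ts = ts; }
    }

    private final Map<String, Entry> data = new HashMap<>();

    // Local write, stamped by the caller.
    public void write(String dn, String value, long ts) {
        apply(dn, new Entry(value, ts));
    }

    // Replicated write from a peer master: keep the newer value.
    private void apply(String dn, Entry incoming) {
        Entry current = data.get(dn);
        if (current == null || incoming.ts >= current.ts) {
            data.put(dn, incoming);
        }
    }

    // Push all local entries to a peer, as a replication cycle would.
    public void replicateTo(MultimasterSketch peer) {
        data.forEach(peer::apply);
    }

    public String read(String dn) {
        Entry e = data.get(dn);
        return e == null ? null : e.value;
    }
}
```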
Session state availability means that data associated with a user session is not lost during a service failover. When a service failover occurs, the session state data that is stored by the failed instance is made available to the failover instance. This mechanism is known as session failover. The result is that the service failover is transparent to the user: the user will not be required to log in again or to restart a business operation.
Session failover is supported in the reference configuration through two mechanisms:
Access Manager session failover. Access Manager session information is created when a user is authenticated and stored in a replicated database. This database is shared by Access Manager instances and accessed through Message Queue. If an Access Manager instance fails, the load balancer routes all user requests to a failover instance (service failover). The failover instance retrieves session information from the shared database and maintains the session.
Portlet session failover. The JSR 168 portlet specification requires portlets to map state information to an HTTP session. If a web container supports highly available HTTP sessions, and if a Portal Server instance fails, the HTTP session state can be recovered by the failover instance. In the reference configuration, Portal Server is deployed in an Application Server cluster, in which High Availability Session Store (HADB) is used to store and replicate portlet session state. The failover instance retrieves session information from HADB and maintains the session.
Portlet session failover requires availability of Access Manager session state. An Access Manager failure could therefore interfere with portlet session failover, unless Access Manager session failover is also implemented.
When a user is successfully authenticated with Access Manager, the browser is redirected to a Portal Server instance. A portal desktop session is created on this instance and is mapped to the user's Access Manager session. This portal desktop session is used to track Portal Server-specific information such as the user's merged display profile and provider properties. If a Portal Server instance fails, the desktop session is automatically re-created by using the user's display profile and attributes that are stored in the Access Manager session. However, provider properties that are stored in local memory are lost.
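One practical consequence of HADB-based replication is that any state a portlet stores in the HTTP session must be serializable so it can be copied between instances. The sketch below performs the serialize/deserialize roundtrip that replication effectively relies on; the SessionState class is a hypothetical example, not a Portal Server type.

```java
import java.io.*;

// Session attributes stored by portlets must be serializable for HADB
// to replicate them between Application Server instances. This sketch
// simulates that replication as a serialize/deserialize roundtrip.
public class SessionStateRoundtrip {
    // Hypothetical example of portlet session state; not a real
    // Portal Server class.
    public static class SessionState implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String selectedTab;
        public final int pageNumber;
        public SessionState(String selectedTab, int pageNumber) {
            this.selectedTab = selectedTab;
            this.pageNumber = pageNumber;
        }
    }

    // Write the state out and read it back, as session replication
    // does when copying state to a failover instance.
    public static SessionState replicate(SessionState state) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(state);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SessionState) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}
```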
The security requirements of the portal service reference configuration (see Security Requirements) are met through several mechanisms, each of which are discussed in the following sections:
Each user's access must be limited to the portal services and data channels that he or she is authorized to view.
The reference configuration uses Access Manager and Directory Server to control user access to portal content. The directory service maintains each user's portal desktop profile. This profile includes any desktop customization that is performed by the user, as well as mechanisms for determining what content the user is authorized to view.
The modularized architecture makes it easy for different organizations to administer different service modules so that each organization has the level of administrative security it needs. In most enterprises, for example, directory services and Access Manager services are administered by security-oriented organizations, while portal services are administered by end-user applications organizations.
The portal service must be secured against unauthorized and unauthenticated access.
The deployment architecture uses a secure network topology for the portal service, which includes the use of firewalls, controlled access through load balancers with virtual service addresses, and private subnets behind the firewall.
Figure 2–2 shows a portal services zone in which the portal service, Access Manager service, and directory service modules are deployed behind the Internal Firewall. Within this zone, the deployment architecture protects the service modules in the following ways:
A load balancer provides a single point of contact for the portal service, even though the service consists of two Portal Server instances that are running on two computers. This means that there is only one opening in the firewall for the portal service, and all of the traffic for the portal service is routed through the load balancer. Note that employees connected to the main corporate network also access the portal through this load balancer.
Local access to the portal service is only from trusted computers on the corporate network, by users who have authenticated themselves to the corporate network.
Not shown in Figure 2–2, but implied in the deployment architecture, is a network topology that creates separate subnets for accessing each service module. The IP addresses that are used in the subnets are private IP addresses, making the subnets invisible to the outside world. These subnets are connected only through the load balancers, further impeding the ability of intruders to access the actual computers behind the public URL. For more information on the network topology, see Network Connectivity Specification.
Not shown in Figure 2–2 is that the individual computers hosting service instances are hardened and that their operating system installations are minimized. Because the majority of system penetrations exploit operating system vulnerabilities, minimizing the number of installed Solaris OS packages reduces the number of potential security holes. Minimizing the operating system is covered in detail in Computer Hardware and Operating System Specification.
The secure remote access option provides employees or customers on the public Internet with secure access to portal services, applications, and other content on an internal intranet, while preventing such access by unauthorized people.
The requirement for secure remote access is met in the Portal Service on Application Server Cluster reference configuration through Portal Server SRA components, specifically the SRA Gateway service, and by network access zones, demarcated by firewalls, that take maximum advantage of the SRA Gateway service. The access zones and the firewalls are represented in Figure 2–2.
The outermost zone in Figure 2–2 is the so-called demilitarized zone, or DMZ, which contains the SRA Gateway service. The Gateway service can only be accessed through the External Firewall at one specific URL. Employees or customers who connect to the portal service with remote browser clients or mobile clients do so by accessing the Gateway service at the specified URL. The External Firewall blocks all other ports and addresses.
Because remote access to the portal service from the public Internet is through the Gateway service, the portal service itself can reside behind an additional firewall (the Internal Firewall) and an additional layer of hardware load balancing.
In addition to deploying the Gateway service behind an Internet-facing firewall, the deployment architecture secures the Gateway service in the following ways:
The Gateway service requires the authentication of all users. Users who access the URL for the Gateway service in their browsers are presented with a login page and must type a user ID and password to gain access to any content.
The Gateway service instances are behind a hardware load balancer. The load balancer provides a single point of contact for the Gateway service, even though multiple component instances are running on multiple computers. As a result, only one port in the firewall is needed for the Gateway service, and all requests are routed through the load balancer.
The communication between the browser and the Gateway service load balancer is encrypted using the SSL protocol. This protocol is required because this traffic circulates through an unsecured network (the Internet). SSL also requires the use of server certificates, which assure clients of the authenticity of the service they are connecting to. Optionally, client certificates can be used to further authenticate access to the Gateway service.
The modular nature of the reference configuration's deployment architecture means that you can scale each module independently, depending on the kind of traffic that your portal service receives.
Each service module in the deployment architecture is composed of two or more service instances running on separate computers behind a load balancer. This architecture allows you to scale any of the modules vertically (by adding CPUs or memory to the host computers) or horizontally (by adding additional service instances). Some modules are better suited to vertical scaling, and some modules are better suited to horizontal scaling.
The recommended techniques for scaling each module in the reference configuration are as follows:
Scaling the directory service module:
Directory Server scales almost linearly up to 12 CPUs, so vertical scaling is an effective technique for this module.
A limitation on the performance of Directory Server is the complexity of the LDAP directory tree. Access Manager creates access control instructions (ACIs) for each Access Manager organization. Creating multiple organizations increases the load on Directory Server, as it must process more requests from Access Manager. At some point (at about 1000 organizations), vertical scaling is no longer effective.
In that case it becomes more effective to scale horizontally, keeping the multiple Directory Server instances synchronized by using the Directory Server's multimaster replication feature. Other approaches include trimming down the number of ACIs created for each organization and running Access Manager in realm mode instead of legacy mode. Having thousands of organizations is not a common requirement, so the reference configuration does not explore the architectural implications of large numbers of Access Manager organizations.
Scaling the portal and Access Manager service modules:
These modules can be scaled effectively by adding computers running additional component instances to the module. This approach is cost-effective and also helps maintain availability because spreading the load over additional computers ensures that only a relatively small amount of capacity is lost if a single hardware system fails.
Both Portal Server and Access Manager run in web containers. When they run in a 32–bit web container, as described in this reference configuration, the maximum process size is 4 Gbytes of memory, limiting the number of user session objects that can be stored. If increased memory is needed or increased throughput is desired, these modules should be scaled horizontally.
It might seem that to better utilize memory (the computers used in the reference configuration have 16 Gbytes of memory), it would be possible to run multiple instances on the same hardware. However, this kind of vertical scaling breaks the modularity of the architecture and does not substantially increase throughput (the number of pages that are rendered per second).
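The 4-Gbyte process-size ceiling discussed above can be turned into rough sizing arithmetic. In the sketch below, only the 4-Gbyte limit comes from the text; the fraction of the process usable for session storage and the per-session footprint are assumed example values.

```java
// Illustrative arithmetic for the 32-bit process-size ceiling.
// The 4-Gbyte limit is from the text; the heap fraction and
// per-session footprint are assumed example values.
public class SessionCapacity {
    // Sessions one instance can hold, given the process limit, the
    // fraction of it usable for sessions, and per-session bytes.
    public static long sessionsPerInstance(long processLimitBytes,
                                           double heapFraction,
                                           long bytesPerSession) {
        return (long) (processLimitBytes * heapFraction) / bytesPerSession;
    }

    // Instances needed (rounded up) for the expected concurrent sessions.
    public static long instancesNeeded(long concurrentSessions,
                                       long sessionsPerInstance) {
        return (concurrentSessions + sessionsPerInstance - 1) / sessionsPerInstance;
    }

    public static void main(String[] args) {
        long limit = 4L * 1024 * 1024 * 1024;                // 4-Gbyte process size
        long perInstance = sessionsPerInstance(limit, 0.5, 50 * 1024); // ~50 KB/session
        System.out.println(perInstance);                      // sessions per instance
        System.out.println(instancesNeeded(100_000, perInstance));
    }
}
```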
Scaling the gateway service module:
The Gateway service can be scaled effectively by adding computers running additional component instances to the module. This approach is cost-effective and also helps maintain availability because spreading the load over additional computers ensures that only a relatively small amount of capacity is lost if a single hardware system fails.
The reference configuration architecture builds the portal service out of several subservices, such as the Access Manager service and directory service. Because each subservice is implemented in a separate module, it is possible to maintain each module independently.
In addition, the reference configuration architecture creates each subservice as a virtual service, which means that interoperability among the subservices is not dependent on specific hardware connections, and the individual subservices are maintained, upgraded, replaced, and scaled without affecting each other. For example, if it is necessary to add an Access Manager instance to the architecture, the Portal Server instances that depend on Access Manager do not need to be modified or affected in any way.
Depending on your quality-of-service requirements, certain parts of the reference configuration can be changed or omitted. This section briefly discusses these options, which include the following:
The reference configuration deployment architecture supports portlet session failover, as described in Session State Availability. It does this by deploying Portal Server in an Application Server cluster that uses High Availability Session Store (HADB) to store and replicate portlet session state.
If your business solution does not involve portlets that store session state, then portlet session failover might not be a requirement for your portal service deployment. If that is the case, you do not need to deploy Portal Server in an Application Server cluster. However, if you have other reasons besides portlet session failover for deploying Portal Server in an Application Server cluster, you can use this guide, but omit the section on implementing portlet session failover.
Portal Server can be deployed in a web container provided by nonclustered Application Server instances. This approach would substantially change the implementation of the portal service module described in Chapter 6, Implementation Module 3: Portal Server With Portlet Session Failover on Application Server Cluster.
At the present time, however, an alternative implementation for Portal Server on Application Server (without portlet session failover) has not yet been documented.
The reference configuration deployment architecture supports Access Manager session failover, as described in Session State Availability. It does so by configuring Access Manager to use Message Queue and a highly available database to store and replicate Access Manager session state.
If your business solution permits users to log in again to reestablish a session after a service failover, then Access Manager session failover is not a requirement for your portal service deployment. If that is the case, you do not need to configure Access Manager for session failover, and Message Queue would not be included as a component in the Access Manager service module. This approach would change the implementation of the Access Manager service module by not requiring the procedures in Implementing Session Failover for Access Manager.
The reference configuration deployment architecture supports secure access to portal services, applications, and other content on an internal intranet to users on the public Internet. This feature is supported by the SRA Gateway module, as described in Secure Remote Access.
If your business solution does not require secure access to portal services, applications, and other content over the public Internet, then secure remote access is not a security requirement for your portal service deployment. For example, you might be using one of the following alternate scenarios to access the portal service:
An Internet-accessible portal service that communicates over SSL and is deployed in a DMZ
A portal service that is located behind an organization's firewalls and accessed only locally or through VPN connections
An internal portal service that is only accessed on a corporate network
In these scenarios, you can omit Chapter 7, Implementation Module 4: Secure Remote Access Gateway from the reference configuration architecture. However, depending on the scenario, you might need to modify the network topology of the reference configuration accordingly.
Two of the components in the reference configuration, Portal Server and Access Manager, run in web containers. The Java ES component set gives you the choice of using either Sun Java System Web Server or Sun Java System Application Server for a web container.
You need to consider both technical and non-technical factors when you choose a web container.
The following technical factors address the abilities of the different containers to run different types of portal content:
Portlets and providers are Portal Server mechanisms for building presentation channels that can aggregate content from other applications. If your plans include developing portlets or providers that use Java EE APIs that are not supported by Web Server, such as the Enterprise JavaBeans (EJB) or Java Connector Architecture (JCA) interfaces, then you must use Application Server as your web container.
Web Server 7.0 supports a lightweight mechanism for HTTP session failover. This mechanism could eventually be used to enable portlet session failover in the same way that HADB and Application Server clusters enable such failover. However, this new Web Server feature and its impact on reliability, security, and performance have not yet been fully analyzed.
A reference configuration guide that documents a portal service deployment on Web Server does not yet exist.
If none of the technical factors are decisive for your organization, the following non-technical considerations could prove decisive:
Does your organization have existing standards for a web container? If so, you are likely to use that web container to implement the portal service reference configuration.
What does a price-to-performance comparison of the web containers reveal? Your organization might choose a web container based on the cost of the licenses that are needed to support the organization's user base. Your organization might have a volume discount agreement with a vendor that affects this decision.
Your organization might have support agreements with a web container vendor.
You might want to choose the same web container for all elements of your portal service even if you are not colocating Portal Server instances and portal channel applications. For example, if you have portal channels that are running in Application Server, you might want to deploy Portal Server in Application Server for the sake of consistency.
If there is no compelling reason to use Application Server in your portal, Web Server can be easier to administer.