Sun Java System Portal Server 7.1 Deployment Planning Guide

Chapter 5 Planning Your Deployment Design

During the deployment design phase of the solution life cycle, you design a high-level deployment architecture and a low-level implementation specification, and prepare a series of plans and specifications necessary to implement the solution. Project approval occurs in the deployment design phase.

This chapter contains the following sections:

About Deployment Design

Deployment design begins with the deployment scenario created during the logical design and technical requirements phases of the solution life cycle. The deployment scenario contains a logical architecture and the quality of service requirements for the solution. You map the components identified in the logical architecture across physical servers and other network devices to create a deployment architecture. The quality of service requirements provide guidance on hardware configurations for performance, availability, scalability, and other related quality of service specifications.

Designing the deployment architecture is an iterative process. You typically revisit the quality of service requirements and reexamine your preliminary designs. You take into account the interrelationship of the quality of service requirements, balancing the trade-offs and cost of ownership issues to arrive at an optimal solution that ultimately satisfies the business goals of the project.

The best way to start deployment planning is to begin with a reference or building block architecture which is already well documented and tested. It is easier to modify this into what's required rather than starting from scratch. Factors that contribute to successful deployment design are past design experience, knowledge of systems architecture, domain knowledge, and applied creative thinking.

Deployment design typically revolves around achieving performance requirements while meeting other quality of service requirements. The strategies you use must balance the trade-offs of your design decisions to optimize the solution. The methodology you use typically involves the following tasks:

Estimate processor requirements. Deployment design often begins with portal sizing in the logical architecture. Start with the use cases and modify your estimates accordingly. Also consider any previous experience you have with designing enterprise systems.
Estimate processor requirements for secure transport. Study use cases that require secure transport and modify CPU estimates accordingly.
Replicate services for availability and scalability. Make modifications to the design to account for quality of service requirements for availability and scalability. Consider load balancing solutions that address availability and failover considerations.
During your analysis, consider the trade-offs of your design decisions. For example, what affect does the availability and scalability strategy have on serviceability (maintenance) of the system? What are the others costs of the strategies?
Identify bottlenecks. Examine the deployment design to identify any bottlenecks that cause the transmission of data to fall beneath requirements, and make adjustments.
Optimize resources. Review your deployment design for resource management and consider options that minimizes costs while fulfilling requirements.
Manage risks. Revisit your business and technical analysis, and modify your design to account for events or situations that your earlier planning did not foresee.

Estimating Processor Requirements

With a baseline figure established in the usage analysis, you can then validate and refine that figure to account for scalability, high availability, reliability, and good performance:

Steps to Estimate Processor Requirements

Customize the Baseline Sizing Figures

Validate Baseline Sizing Figures

Refine Baseline Sizing Figures

Validate Your Final Figures

The following sections describe these steps.

Customize the Baseline Sizing Figures

Establishing an appropriate sizing estimate for your Portal Server deployment is an iterative process. You might wish to change the inputs to generate a range of sizing results. Customizing your Portal Server deployment can greatly affect its performance.

After you have an estimate of your sizing, consider:

LDAP Transaction Numbers

Use the following LDAP transaction numbers for an out-of-the-box portal deployment to understand the impact of the service demand on the LDAP master and replicas. These numbers change once you begin customizing the system.These numbers are taken for the Developer Sample and include access manager work.

Access to authless anonymous portal - 6 BINDS, 49 SRCH
Login by using the Login channel - 1 BIND, 2 SRCH
Removing a channel from the Portal Desktop - 14 SRCH, 2MOD
Reloading the Portal Desktop - 0 ops

Web Container Requirements

One of the primary uses of Portal Server installed on an web container is to integrate portal providers J2EE technology stack constructs, such as Enterprise JavaBeans^TM in the case of the application server or things like Java Database Connectivity (JDBC) and LDAP (LDAPSDK) connectivity. These other applications and modules can consume resources and affect your portal sizing. It is best to use connection pooling and jndi tooling available through the web container.

Validate Baseline Sizing Figures

Now that you have an estimate of the number of CPUs for your portal deployment, use a trial deployment to measure the performance of the portal. Use load balancing and stress tests to determine:

Throughput, the amount of data processed in a specified amount of time
Latency, the period of time that one component is waiting for another component
Maximum number of concurrent sessions

Portal samples are provided with the Portal Server. You can use them, with channels similar to the ones you will use, to create a load on the system. The samples are located on the Portal Desktop.

Use a trial deployment to determine your final sizing estimates. A trial deployment helps you to size back-end integration to avoid potential bottlenecks with Portal Server operations.

Refine Baseline Sizing Figures

Your next step is to refine your sizing figure. In this section, you build in the appropriate amount of headroom so that you can deploy a portal site that features scalability, high availability, reliability and good performance.

Because your baseline sizing figure is based on so many estimates, do not use this figure without refining it.

When you refine your baseline sizing figure:

Use your baseline sizing figure as a reference point.
Expect variations from your baseline sizing figure.
Learn from the experience of others.
Use your own judgement and knowledge.
Examine other factors in your deployment.

If the Portal Server deployment involves multiple data centers on several continents and even traffic, you need a higher final sizing figure than if you have two single data centers on one continent with heavy traffic.
Plan for changes.

A portal site is likely to experience various changes after you launch it. Changes you might encounter include the following:
- An increase in the number of channels
- Growth in the user base
- Modification of the portal site’s purpose
- Changes in security needs
- Power failures
- Maintenance demands
  
  Considering these factors enables you to develop a sizing figure that is flexible and enables you to avoid risk when your assumptions regarding your portal change following deployment.
  
  The resulting figure ensures that your portal site has:
- Scalability high availability, reliability and high performance
- Room for whatever you want to provide
- Flexibility for adjusting to changes

Validate Your Final Figures

Use a trial deployment to verify that the portal deployment satisfies your business and technical requirements.

Identifying Performance Bottlenecks

By identifying potential bottlenecks during the design phase, you can plan for your portal performance needs.

Before reading the section on memory consumption and its affect on performance, read the following document on tuning the Java Virtual Machine, version 1.4.2. If you are deploying on Sun Java System Application Server or Sun Java System Web Server, the Java Virtual machine will be version 1.5. However if you are deploying on a third party web container it could be version 1.4.2.

http://java.sun.com/products/hotspot/index.html

Memory Consumption and Garbage Collection

Portal Server requires substantial amounts of memory to provide the highest possible throughput. At initialization, a maximum address space is virtually reserved but does not allocate physical memory unless needed. The complete address space reserved for object memory (heap) can be divided into young generation (eden) and old space (tenured). As the name suggests young means a area reserved for newly created Java objects, and old refers to space allocated to Java objects that have been around for a while.

Most applications suggest using a larger percentage of the total heap for the new generation, but in the case of Portal Server, using only one eighth the space for the young generation is appropriate, because most memory used by Portal Server is long-lived. The sooner the memory is copied to the old generation the better the garbage collection (GC) performance.

By default, even with a large heap size, after a portal instance has been running under moderate load for a few days, most of the heap appears to be used because of the lazy nature of the GC. The GC performs full garbage collections until the resident set size (RSS) reaches approximately 85 percent of the total heap space; at that point the garbage collections can have a measurable impact on performance.

For example, on a 900 MHz UltraSPARCIII^TM, a full GC on a 2 GB heap can take over ten seconds. During that period of time, the system is unavailable to respond to web requests. During a reliability test, full GCs are clearly visible as spikes in the response time. In production, full GCs can be unnoticed but monitoring scripts that measure system performance need to account for the possibility that a full GC can occur frequently.

Measuring the frequency of full GCs as a indicator may sometimes be useful in determining if a memory leak is present in the web container. Conduct an analysis that shows the expected frequency (of a baseline system) and compare that to the observed rate of full GCs. To record the frequency of GCs, use the vebose:gc JVM^TM parameter.

Optimizing Resources

You can optimize resources by using the following:

SSL Off-loading onto Load Balancer

SSL-intensive servers, such as the Secure Remote Access Gateway, require large amounts of processing power to perform the encryption required for each secure transaction. Using a load balancer with ssl capability can speed up the Portal Gateway's by off-loading the execution of cryptographic algorithms.

Terminating SSL traffic on a load balancer in a DMZ simplifies the portal topology. Access manager sessions are maintained in cookies and it is very important from a performance point of view that the correct servers are engaged in processing a browser request. For Example, those servers which have that particular session in its cache. Session stickiness is much easily achieved using cookies and http than using https to all the back-end servers.

Sun Enterprise Midframe Line

Normally, for a production environment, you would deploy Portal Server and Secure Remote Access on separate machines. However, in the case of the Sun Enterprise^TM midframe machines, which support multiple hardware domains, you can install both Portal Server and Secure Remote Access in different domains on the same Sun Enterprise midframe machine. The normal CPU and memory requirements that pertain to Portal Server and Secure Remote Access still apply; you would implement the requirements for each in the separate domains.

In this type of configuration, pay attention to security issues. For example, in most cases the Portal Server domain is located on the intranet, while the Secure Remote Access domain is in the DMZ.

Managing Risks

This section contains a few tips to help you in the sizing process.

A business-to-consumer portal requires that you deploy Secure Remote Access to use the Gateway and SSL. Make sure you take this into account for your sizing requirements. Once you turn on SSL, the performance of the portal can be up to ten times slower than without SSL.
For a business-to-employee portal, make sure that you have a user profile that serves as a baseline.
For any portal, build in headroom for growth. This means not just sizing for today’s needs, but future needs and capacity. This includes usual peaks after users return from a break, such as a weekend or holiday, or if usage is increased over time because the portal is more “sticky.”
If you are deploying your portal solution across multiple geographic sites, you need to fully understand the layout of your networks and data centers
Decide what type of redundancy you need. Consider items such as production down time, upgrades, and maintenance work. In general, when you take a portal server out of production, the impact to your capacity should be no more than one quarter of the overall capacity.
In general, usage concurrencies for a business-to-employee portal are higher than a business-to-consumer portal.