Sun Java System Portal Server 7 Deployment Planning Guide

Usage Analysis for Portal Server

You need to establish baseline sizing figures that can be used in the logical architecture and deployment design. Your technical representative can provide you with an automated sizing tool to calculate the estimated number of CPUs your Portal Server deployment requires.

Note –

Sizing requirements for a secure portal deployment using Sun Java™ System Secure Remote Access (SRA) software are covered in Usage Analysis for SRA.

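You need to gather the following metrics for input to the sizing tool:

Peak Numbers
Average Time Between Page Requests
Concurrent Users
Average Session Time
Search Engine Factors
Page Configuration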

Other performance metrics that affect the number of CPUs a Portal Server deployment requires, but are not used by the sizing tool, are:

Portal Desktop Configuration
Hardware and Applications
Back-End Servers
Transaction Time
Workload Conditions

A discussion of these performance factors follows.

Peak Numbers

Maximum number of concurrent sessions defines how many connected users a Portal Server deployment can handle.

To calculate the maximum number of concurrent sessions, use this formula:

maximum number of concurrent sessions =
expected percent of users online * user base

To identify the size of the user base or pool of potential users for an enterprise portal, here are some suggestions:

Identify only users who are active. Do not include users who are, for example, away on vacation, or on leave.
Use a finite figure for user base. For an anonymous portal, estimate this number conservatively.
Study access logs.
Identify the geographic locations of your user base.
Remember what your business plan states regarding who your users are.

Average Time Between Page Requests

Average time between page requests is how often, on average, a user requests a page from the Portal Server. Pages could be the initial login page to the portal, or a web site or web pages accessed through the Portal Desktop. A page view is a single call for a single page of information no matter how many items are contained on the page.

Though web server logs record page requests, using the log to calculate the average time between requests on a user basis is not feasible. To calculate the average time between page requests, you would probably need a commercially available statistics tool, such as the WebLoad performance testing tool. You can then use this figure to determine the number of concurrent users.

Note –

Page requests measure web server traffic more accurately than “hits.” Every request for any file on the web server counts as a hit. A single page call can record many hits, because every item on the page is registered. For example, a page containing 10 graphic files records 11 “hits”: one for the HTML page itself and one for each of the 10 graphic files. For this reason, page requests give a more accurate picture of web server traffic.
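The difference between hits and page requests can be illustrated with a short Python sketch. The function name and file names are hypothetical, and the rule that any `.html` file counts as a page is a simplifying assumption made only for this example.

```python
def count_hits_and_page_requests(requested_files):
    """Return (hits, page_requests) for a list of requested file names."""
    # Every requested file counts as one hit.
    hits = len(requested_files)
    # Assumption for this sketch: only .html files represent page requests.
    page_requests = sum(1 for name in requested_files if name.endswith(".html"))
    return hits, page_requests


# One HTML page that references 10 graphic files.
files = ["index.html"] + [f"image{i}.gif" for i in range(1, 11)]
print(count_hits_and_page_requests(files))  # (11, 1)
```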

Concurrent Users

A concurrent user is one connected to a running web browser process and submitting requests to or receiving results of requests from Portal Server. The maximum number of concurrent users is the highest possible number of concurrent users within a predefined period of time. Calculate the maximum number of concurrent users after you calculate the maximum number of concurrent sessions. To calculate the maximum number of concurrent users, use this formula:

concurrent users = number of concurrent sessions / average time between page requests

For example, consider an intranet Portal Server deployment with 50,000 registered users. The number of connected sessions under peak load is estimated to be 80% of the registered user base. On average, a user accesses the Portal Desktop once every 10 minutes.

The calculation for this example is:

50000 * 0.80 = 40000 concurrent sessions
40000 / 10 = 4000 concurrent users

The maximum number of concurrent users during the peak hours for this Portal Server site should be 4,000.
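The same arithmetic can be written as a small Python sketch, which makes it easy to try other user bases and request intervals. The function and parameter names are illustrative only; they are not part of Portal Server or the sizing tool.

```python
def max_concurrent_sessions(user_base: int, percent_online: float) -> float:
    """Maximum concurrent sessions = expected percent of users online * user base."""
    return user_base * percent_online / 100


def max_concurrent_users(sessions: float, minutes_between_requests: float) -> float:
    """Concurrent users = concurrent sessions / average time between page requests."""
    return sessions / minutes_between_requests


# Values from the intranet example: 50,000 users, 80% online, one request every 10 minutes.
sessions = max_concurrent_sessions(user_base=50_000, percent_online=80)
users = max_concurrent_users(sessions, minutes_between_requests=10)
print(sessions, users)  # 40000.0 4000.0
```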

Average Session Time

Average session time is the time between user login and logout, averaged over a number of users. The length of the session time is inversely proportional to the number of logins occurring (that is, the longer the session duration, the fewer logins per second are generated against Portal Server for the same concurrent user base).

How the user uses Portal Server often affects average session time. For example, a user session involving interactive applications typically has a longer session time than a user session involving information only.
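The inverse relationship between session time and login rate can be sketched in Python. This is not a formula from the sizing tool: it assumes a steady state in which every session that ends is replaced by a new login, and the function name and session lengths are hypothetical.

```python
def logins_per_second(concurrent_sessions: float, average_session_minutes: float) -> float:
    """Estimate the login rate, assuming a steady state (an assumption of this sketch)."""
    return concurrent_sessions / (average_session_minutes * 60)


# Same 40,000 concurrent sessions, two hypothetical average session lengths.
for minutes in (20, 40):
    rate = logins_per_second(40_000, minutes)
    print(f"{minutes}-minute sessions: {rate:.1f} logins per second")
```

Doubling the average session time halves the estimated login rate for the same number of concurrent sessions.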

Search Engine Factors

If your portal site will offer a Search channel, you need to include sizing factors for the Search Engine in your sizing calculations. Search Engine sizing requirements depend on the following factors:

The size of index partitions on the active list of the index directory

Partition size is directly proportional to the size and number of indexed and searchable terms.
Average disk space requirement of a resource description (RD)

To calculate this, use this formula:
```
average disk space requirement =
database size / number of RDs in database
```
The average size adjusts for variations in the sizes of RDs. A collection of long, complex RDs with many indexed terms and a collection of short RDs with few indexed terms require different search times, even if the two collections contain the same number of RDs.

RDs are stored in a hierarchical database format, where the intrinsic size of the database must be accounted for, even when no RD is stored.
The number of concurrent users who perform search-related activities

To calculate this, use this formula:
```
number of concurrent search users =
number of concurrent users / average time between search hits
```
Use the number of concurrent users value calculated in Concurrent Users.
The type of search operators used

Types of search functions include basic, combining, proximity, passage and field operator, and wildcard scans. Each function uses different search algorithms and data structures. Because the differences between these algorithms and data structures grow as the number of search terms and indexed terms increases, the type of search function affects the time taken to return search results.
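The two Search Engine formulas above can be sketched in Python as follows. The function names, the database size, the RD count, and the 20-minute search interval are hypothetical values chosen for illustration; only the 4,000 concurrent users figure comes from the Concurrent Users example.

```python
def average_rd_disk_space(database_size_bytes: float, rd_count: int) -> float:
    """Average disk space requirement = database size / number of RDs in database."""
    return database_size_bytes / rd_count


def concurrent_search_users(concurrent_users: float, minutes_between_search_hits: float) -> float:
    """Concurrent users divided by the average time between search hits."""
    return concurrent_users / minutes_between_search_hits


# Hypothetical database: 500,000,000 bytes holding 250,000 RDs.
print(average_rd_disk_space(500_000_000, 250_000))  # 2000.0
# 4,000 concurrent users, each searching once every 20 minutes (hypothetical).
print(concurrent_search_users(4_000, 20))  # 200.0
```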

Page Configuration

If you are using an authenticated portal, you must specify both Login Type and Desktop Type in the page configuration section of the automated sizing tool.

Login Type. Describes the type of portal page (content configuration and delivery method) that end users initially see after submitting user name and password. This process is typically taxing on the system because it involves checking credentials, initializing the session, and delivering initial content.

The Measured CPU Performance characteristic associated with the Login Type is the Initial Desktop Display variable.
Desktop Type. Describes the type of portal pages (content configuration and delivery method) that end users see after the initial portal page. These pages are displayed with each subsequent interaction with the portal, or on Desktop refresh. Because the session has already been established and cached content can be exploited, fewer system resources are typically required and the pages are delivered more rapidly.

The Measured CPU Performance characteristic associated with the Desktop Type is the Desktop Reload variable.

For both Login Type and Desktop Type, select the appropriate content configuration:

Light-JSP. Describes a configuration of two tabs with five channels each.
Regular-JSP. Describes a configuration of two tabs with seven channels each.
Heavy-JSP. Describes a configuration of three tabs with seventeen channels each.
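To compare the three content configurations, it can help to look at the total number of channels each one implies. The Python sketch below records the tab and channel counts listed above; the dictionary and function names are illustrative and are not sizing tool settings.

```python
# (tabs, channels per tab) for each content configuration described above.
CONTENT_CONFIGURATIONS = {
    "Light-JSP": (2, 5),
    "Regular-JSP": (2, 7),
    "Heavy-JSP": (3, 17),
}


def total_channels(configuration: str) -> int:
    """Return the total number of channels across all tabs."""
    tabs, channels_per_tab = CONTENT_CONFIGURATIONS[configuration]
    return tabs * channels_per_tab


for name in CONTENT_CONFIGURATIONS:
    print(name, total_channels(name))
# Light-JSP 10
# Regular-JSP 14
# Heavy-JSP 51
```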

Tip –

You can now give the above figures to your technical representative and ask that the sizing tool be run to identify your estimated number of CPUs.

Portal Desktop Configuration

Portal Desktop configuration explicitly determines the amount of data held in memory on a per-session basis.

The more channels on the Portal Desktop, the larger the session data size and the lower the throughput of Portal Server.

Another factor is how much interactivity the Portal Desktop offers. For example, channel clicks can generate load on Portal Server or on some other external server. If channel selections generate load on Portal Server, a higher user activity profile and higher CPU overhead occur on the node that hosts the Portal Desktop than on a node that hosts some other external server.
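Because session data is held in memory for each session, a rough memory estimate is the number of concurrent sessions multiplied by the per-session data size. The Python sketch below illustrates this; the 50 KB per-session figure is a made-up value, and real per-session sizes must be measured for your own Portal Desktop configuration.

```python
def session_memory_mib(concurrent_sessions: int, bytes_per_session: int) -> float:
    """Estimate the memory held by session data, in mebibytes."""
    return concurrent_sessions * bytes_per_session / (1024 ** 2)


# 40,000 concurrent sessions at a hypothetical 50 KB (51,200 bytes) each.
print(session_memory_mib(40_000, 50 * 1024))  # 1953.125
```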

Hardware and Applications

CPU speed and the memory heap size of the virtual machine for the Java™ platform (Java™ Virtual Machine or JVM™ software) affect Portal Server performance.

The faster the CPU speed, the higher the throughput. The JVM memory heap size, along with the heap generations tuning parameters, can also affect Portal Server performance.

Back-End Servers

Portal Server aggregates content from external sources. If external content providers cannot sustain the bandwidth that Portal Server needs to operate at full speed, Portal Desktop rendering times and request throughput will not be optimal. The Portal Desktop waits until all channels have completed (or timed out) before it returns the response to the browser.

Plan your back-end infrastructure carefully when you use channels that:

Scrape their content from external sources
Access corporate databases, which typically have slow response times
Provide email content
Provide calendar content
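The effect of one slow back-end server on the whole Portal Desktop can be illustrated with a short Python sketch. It assumes that channels are fetched in parallel, which this guide does not state, so that the wait is governed by the slowest channel or by the timeout. The channel names, timings, and timeout are hypothetical.

```python
def desktop_wait_time(channel_times: dict, timeout: float) -> float:
    """Time until every channel has completed or timed out, in seconds.

    Assumes channels are fetched in parallel (an assumption of this sketch).
    """
    return max((min(t, timeout) for t in channel_times.values()), default=0.0)


channels = {"news": 0.4, "mail": 1.2, "calendar": 0.8, "database": 9.0}
print(desktop_wait_time(channels, timeout=5.0))  # 5.0

# Without the slow database channel, the slowest remaining channel sets the wait.
del channels["database"]
print(desktop_wait_time(channels, timeout=5.0))  # 1.2
```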

Transaction Time

Transaction time, which is the time taken for an HTTP or HTTPS operation to complete, is the sum of send time, processing time, and response time.

You must plan for factors that can affect transaction time. These include:

Network speed and latency.

Pay particular attention to latency over a wide area network (WAN). Latency can significantly increase retrieval times for large amounts of data.
The complexity of the Portal Desktop.
The browser’s connection speed.

For example, response time is longer over a 33.6 kilobits per second dial-up connection than over a LAN connection. However, processing time should remain constant. Transaction time through a dial-up connection should be faster than the transaction time reported by a load generation tool, because the dial-up connection performs data compression.

When you calculate transaction time, size your Portal Server so that processing time under regular or peak load conditions does not exceed your performance requirement threshold and so that you can sustain processing time over time.
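As a rough illustration, the following Python sketch models transaction time as the sum of its three components and checks processing time against a requirement threshold. The class name, the millisecond values, and the 1,000 ms threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Timing components of one HTTP or HTTPS operation, in milliseconds."""
    send_time: int
    processing_time: int
    response_time: int

    @property
    def transaction_time(self) -> int:
        # Transaction time aggregates send, processing, and response time.
        return self.send_time + self.processing_time + self.response_time


def meets_requirement(transaction: Transaction, max_processing_time: int) -> bool:
    """Check that processing time does not exceed the performance threshold."""
    return transaction.processing_time <= max_processing_time


# Same processing time, different connection speeds (hypothetical values).
lan = Transaction(send_time=50, processing_time=800, response_time=100)
dial_up = Transaction(send_time=200, processing_time=800, response_time=2500)
print(lan.transaction_time, dial_up.transaction_time)  # 950 3500
print(meets_requirement(lan, max_processing_time=1000))  # True
```

The connection speed changes the transaction time seen by the user, but only the processing time component reflects how Portal Server itself is sized.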

Workload Conditions

Workload conditions describe which system and JVM software resources are used most heavily on a system. These conditions largely depend on user behavior and the type of portal you deploy.

The most commonly encountered workload conditions on Portal Server software affect:

System performance

Portal Server performance is impacted when a large number of concurrent requests are handled (such as a high activity profile). For example, during peak hours in a business-to-enterprise portal, a significant number of company employees connect to the portal at the same time. Such a scenario creates a CPU-intensive workload. In addition, the ratio of concurrent users to connected users is high.
System capacity

Portal Server capacity begins to be affected when large numbers of users log in. As more users log in, they consume more of the available memory, so less memory is available to process requests made to the server. For example, in a business-to-consumer web portal, a large number of logged-in users are redirected to external web sites once the initial Portal Desktop display is loaded. However, as more users continue to log in, they need more memory, even though the ratio of users submitting requests to Portal Server to users who are merely logged in is low.

Depending on the user’s behavior at certain times of the day, week, or month, Portal Server can switch between CPU-intensive and memory-intensive workloads. The portal site administrator must determine the most important workload conditions to size and tune the site to meet the enterprise’s business goals.
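One way to think about the two workload conditions is in terms of the ratio of users submitting requests to users who are logged in. The Python sketch below is only an illustration: the function name and the 0.5 cut-off are arbitrary assumptions, and the appropriate boundary for a real site must come from measurement.

```python
def classify_workload(requesting_users: int, logged_in_users: int,
                      cpu_bound_ratio: float = 0.5) -> str:
    """Label a workload by the ratio of requesting users to logged-in users.

    The cpu_bound_ratio cut-off is an arbitrary assumption for this sketch.
    """
    if logged_in_users <= 0:
        raise ValueError("logged_in_users must be positive")
    ratio = requesting_users / logged_in_users
    return "CPU-intensive" if ratio >= cpu_bound_ratio else "memory-intensive"


# Peak hours on a business-to-enterprise portal: most connected users are active.
print(classify_workload(30_000, 40_000))  # CPU-intensive
# Business-to-consumer portal: many users logged in, few submitting requests.
print(classify_workload(4_000, 40_000))  # memory-intensive
```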