1 Capacity Planning

Capacity planning consists of selecting the right equipment for an Oracle Access Manager deployment based on anticipated usage. For most deployments, Oracle Access Manager functions well using standard off-the-shelf equipment.

When planning for an Oracle Access Manager installation, you want to be sure that your equipment is adequate for handling peak loads.

This chapter provides information on a medium-sized deployment of 20,000 users and a large deployment of 100,000-2,000,000 users.

This chapter includes the following topics:

About Capacity Planning
Estimating Peak Load
Capacity Planning for Multiple Environments
Example from an Actual Deployment

For an overview of Oracle Access Manager, see the Oracle Access Manager Introduction manual.

1.1 About Capacity Planning

Capacity planning is the process of determining your hardware and memory requirements. The goal of capacity planning is to maintain good system performance during times of peak load. There are no rules when it comes to capacity planning. A quote from an anonymous author on the Web states: The reasonable answer to any performance question begins with "it depends."

Having said that, it is important to use the right equipment for the task. In general, it is more cost-effective to have more equipment than you need than to try to make do with inadequate hardware. The cost of extra hardware is low compared to the effort required to maintain an under-powered system. Oracle Access Manager components tend to function well on standard hardware such as the machines listed in this section. In fact, the information in this section may be entirely adequate as the basis for capacity planning in your environment.

If you want to take a more methodical approach to equipment planning, a general guideline for capacity planning is to estimate your peak load and to purchase equipment that is capable of handling peak loads. This chapter presents methods of estimating peak usage.

Note:

You can spend a considerable amount of time calculating your requirements. However, following the recommendations for typical hardware configurations in this chapter may ultimately be the simplest and best choice.

A key point: Remember that it is hard to predict all of the variables in your network. As a result, most calculations of peak usage are error-prone and may ultimately be less reliable than following a few simple guidelines.

1.2 Estimating Peak Load

To ensure that you have adequate equipment for your environment, your machines need to be able to handle the maximum number of operations that can be expected in a particular time interval. The equipment that you use in your Oracle Access Manager installation should be able to accommodate your users during times of peak load.

The information in this section is divided into the following topics:

General guidelines for estimating peak load

These guidelines are somewhat reductionist, but they are reasonable in light of the fact that network performance tends to be unpredictable. If you can estimate your peak system throughput, you generally can predict what class of server you need.
Specific methods

You can use more labor-intensive methods of predicting requirements by analyzing estimated system usage. These more complex methods, however, may or may not result in better overall predictions of hardware requirements.

1.2.1 General Guidelines for Estimating Peak Load

A simple method for estimating peak usage is as follows:

Measure usage over at least a day, and preferably over several weeks.

If usage tends to spike during particular weeks of the year, try to obtain measurements from the busiest weeks.
Use the highest value seen in production.
Estimate what a typical busy load is.

To estimate a typical busy load, multiply the value of an average heavy load by a small integer such as 2 or 3. This allows for usage patterns that are two or three standard deviations higher than an average heavy load, assuming a Gaussian distribution (bell curve) of loads.

If you have a multi-site deployment, you may want to create a chart of peak usage for all sites and estimate peak load based on total estimated usage across the sites. For instance, you can record the number of logged-in users at different times of the day. The following table illustrates this method of estimation:

Peak hours GMT	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24
Mexico	0	0	0	3	75	150	175	225	225	225	225	225	225	225	225	175
Spain	35	50	75	50	25	35	50	75	75	75	75	35	20	0	0	0
Egypt	45	45	50	50	50	50	50	55	55	45	30	10	0	0	0	0
U.S.	0	2	5	10	30	80	95	95	95	95	86	80	80	90	75	60
Colombia	0	2	7	15	30	45	45	50	50	50	55	55	55	55	45	35
Costa Rica	0	0	0	0	2	5	10	10	10	12	12	12	12	12	12	12
Indonesia	60	45	30	10	4	2	0	0	0	0	0	0	0	0	3	5
Taiwan	125	115	100	60	30	5	0	0	0	0	0	0	0	10	25	75
Total	265	259	267	198	246	372	425	510	510	502	483	417	392	392	385	362
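The column sums and the busiest hours can be computed with a short script. The following Python sketch is illustrative only: it copies the per-site figures from the table above, and the names HOURS, USERS_BY_SITE, totals_by_hour, and peak are hypothetical and are not part of Oracle Access Manager.

```python
# Logged-in users per site for GMT hours 9 through 24, copied from the table.
HOURS = list(range(9, 25))
USERS_BY_SITE = {
    "Mexico":     [0, 0, 0, 3, 75, 150, 175, 225, 225, 225, 225, 225, 225, 225, 225, 175],
    "Spain":      [35, 50, 75, 50, 25, 35, 50, 75, 75, 75, 75, 35, 20, 0, 0, 0],
    "Egypt":      [45, 45, 50, 50, 50, 50, 50, 55, 55, 45, 30, 10, 0, 0, 0, 0],
    "U.S.":       [0, 2, 5, 10, 30, 80, 95, 95, 95, 95, 86, 80, 80, 90, 75, 60],
    "Colombia":   [0, 2, 7, 15, 30, 45, 45, 50, 50, 50, 55, 55, 55, 55, 45, 35],
    "Costa Rica": [0, 0, 0, 0, 2, 5, 10, 10, 10, 12, 12, 12, 12, 12, 12, 12],
    "Indonesia":  [60, 45, 30, 10, 4, 2, 0, 0, 0, 0, 0, 0, 0, 0, 3, 5],
    "Taiwan":     [125, 115, 100, 60, 30, 5, 0, 0, 0, 0, 0, 0, 0, 10, 25, 75],
}


def totals_by_hour(users_by_site):
    """Sum the logged-in users across all sites for each hour."""
    return [sum(column) for column in zip(*users_by_site.values())]


def peak(hours, totals):
    """Return the highest total and every hour at which it occurs."""
    top = max(totals)
    return top, [h for h, t in zip(hours, totals) if t == top]


totals = totals_by_hour(USERS_BY_SITE)
peak_users, peak_hours = peak(HOURS, totals)
print(peak_users, peak_hours)  # 510 [16, 17]
```

For this data set, the peak of 510 logged-in users occurs at 16:00 and 17:00 GMT.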

If you can estimate the number of transactions per user during your peak hours, you can estimate the throughput that your system must support. You can compare your transactions-per-second estimates to your equipment manufacturer's estimates for your hardware. Based on these comparisons, you can determine whether the machines you already have are adequate for supporting the estimated load. If they are not, you can base your equipment choices in part on your throughput requirements.
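As a rough illustration of this comparison, the following Python sketch converts a peak user count into transactions per second. The figure of 60 transactions per user per hour, the safety factor, and the vendor rating of 50 transactions per second are all invented for the example, and the function names are hypothetical.

```python
def required_tps(peak_users, tx_per_user_per_hour, safety_factor=1):
    """Estimate the transactions per second needed at peak load."""
    return peak_users * tx_per_user_per_hour * safety_factor / 3600


def is_adequate(required, rated_tps):
    """Compare the estimate with a vendor-rated throughput figure."""
    return rated_tps >= required


# Assumed inputs: 510 peak users, 60 transactions per user per hour.
baseline = required_tps(510, 60)
# Multiply by 3 to allow for loads well above the measured peak.
with_headroom = required_tps(510, 60, safety_factor=3)

print(baseline)                         # 8.5
print(with_headroom)                    # 25.5
print(is_adequate(with_headroom, 50))   # True
```

The result is only as good as the transactions-per-user estimate, so it is worth repeating the calculation with pessimistic inputs.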

Note:

Even this method of estimation may be more rigorous than is required for a deployment of fewer than 20,000 users. Standard server class hardware is adequate for most deployments.

1.2.2 Complex Estimates of Peak Load

As an alternative to taking simple measurements of peak usage patterns, you can use an estimation method. This may be practical if there is no simple way to measure usage, for instance, if you have a geographically distributed environment. In that case, you can identify the hours of peak usage and estimate the number of users who are present during those times. Making these estimates may be more practical than trying to gather statistics on who is logged in at each office.

For instance, suppose your company has offices in a variety of countries. You can create a chart of regular office hours plotted against GMT, as illustrated below. The table allows you to estimate the times (GMT) when the majority of your offices are busy, as indicated by the shaded area in the table. This is a good indication of peak load hours:

GMT	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24
Mexico	19	20	21	22	23	24	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18
Spain	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	1
Egypt	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	1	2
U.S.	20	21	22	23	24	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19
Colombia	20	21	22	23	24	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19
Costa Rica	19	20	21	22	23	24	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18
Indonesia	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	1	2	3	4	5	6	7	8
Taiwan	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	1	2	3	4	5	6	7	8
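The overlap of office hours can also be counted programmatically. In the Python sketch below, the GMT offsets are read off the table above (for example, GMT 1 corresponds to 19 in Mexico, an offset of -6). The 9-to-17 office day is an assumption made for the example, and the names GMT_OFFSETS, local_hour, and offices_open are hypothetical.

```python
# Offsets from GMT in hours, derived from the table of local times.
GMT_OFFSETS = {
    "Mexico": -6, "Spain": 1, "Egypt": 2, "U.S.": -5,
    "Colombia": -5, "Costa Rica": -6, "Indonesia": 8, "Taiwan": 8,
}

# Assumption: an office is busy from local hour 9 up to, but not including, 17.
OPEN_FROM, OPEN_UNTIL = 9, 17


def local_hour(gmt_hour, offset):
    """Convert a GMT hour (1-24) to a local hour (1-24), as in the table."""
    return (gmt_hour + offset - 1) % 24 + 1


def offices_open(gmt_hour):
    """List the offices that are within office hours at the given GMT hour."""
    return [
        office for office, offset in GMT_OFFSETS.items()
        if OPEN_FROM <= local_hour(gmt_hour, offset) < OPEN_UNTIL
    ]


for gmt in range(1, 25):
    open_now = offices_open(gmt)
    print(gmt, len(open_now), open_now)
```

Under these assumptions, the busiest hour is GMT 15, when five of the eight offices are open. A different assumed office day shifts the result.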

Using Human Resources data on the number of users per office, you can estimate the total number of users accessing your resources during peak hours.

Country	Full time	Part time	Contract	Total
Mexico	3021	496	35	3552
Spain	755	356	5	1116
Egypt	329	275	0	604
U.S.	134	25	55	214
Colombia	1290	245	11	1546
Costa Rica	175	130	0	305
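The headcount figures can be combined with the office-hours chart to find the hour with the most potential users. The Python sketch below is a simplified illustration. It assumes a 9-to-17 local office day, expressed as a GMT window for each country, and it counts every listed employee as a potential user. The names HEADCOUNT, BUSY_GMT, and users_present are hypothetical.

```python
# Total users per country, from the Total column of the table above.
HEADCOUNT = {
    "Mexico": 3552, "Spain": 1116, "Egypt": 604,
    "U.S.": 214, "Colombia": 1546, "Costa Rica": 305,
}

# Assumed busy window in GMT as (start, end), end not included. Each window
# is a 9-to-17 local office day shifted by that country's offset from GMT.
BUSY_GMT = {
    "Mexico": (15, 23), "Spain": (8, 16), "Egypt": (7, 15),
    "U.S.": (14, 22), "Colombia": (14, 22), "Costa Rica": (15, 23),
}


def users_present(gmt_hour):
    """Sum the headcount of every office that is busy at the given GMT hour."""
    return sum(
        HEADCOUNT[country]
        for country, (start, end) in BUSY_GMT.items()
        if start <= gmt_hour < end
    )


peak_hour = max(range(1, 25), key=users_present)
print(peak_hour, users_present(peak_hour))  # 15 6733
```

In practice only a fraction of these users are logged in at once, so this figure is an upper bound rather than a concurrency estimate.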

Again, you can create a simple estimate of throughput based on an estimate of the maximum number of users that you expect to be logged in at the same time, the estimated number of transactions per user for a given time period, and ultimately the estimated transactions-per-second that need to be supported. You can compare your transactions-per-second estimates with claims made by your hardware vendors.

1.2.3 Predictions of Concurrency and Throughput

You can calculate peak requirements by estimating either concurrency or throughput. Oracle Access Manager is a stateless system. As a result, a discussion of concurrency is not entirely meaningful and estimates of throughput are more useful. For example, a user can log in but let their account remain idle until they time out. This user is accessing the account concurrently with other users, but their effect on throughput may be minimal.

However, you may want to use estimates of concurrency as a partial method of predicting system requirements. Based on an estimate of concurrency and an estimate of average number of transactions per user per minute, you can estimate system load.

For example, the Response-Time Law states that the number of requests per second (X_o) is equal to the number of concurrently logged-in users (N) divided by the sum of the response time (R) and the think time (Z):

X_o = N / (R + Z)

For example, for a community of 5,000 users with an estimated response time of 3 seconds and an estimated user wait or think time of 2 seconds, the estimated requests per second would be 1,000.
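The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch, and the function name throughput is hypothetical.

```python
def throughput(users, response_time_s, think_time_s):
    """Response-Time Law: requests per second = N / (R + Z)."""
    return users / (response_time_s + think_time_s)


# 5,000 users, 3-second response time, 2-second think time.
print(throughput(5000, 3, 2))  # 1000.0
```

The parentheses matter: dividing N by R alone and then adding Z gives a very different and incorrect result.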

Little's Law states that the number of simultaneous connections is equal to the throughput times the response time. That is:

N (concurrency) = X_o (requests per second) * R_site (response time for the Web site)

Based on a network response time of 1 second, the Web site response time would be 2 seconds (3 - 1), and the estimated concurrency would be 1,000 * 2, or 2,000 users. Assuming this is an estimate of concurrency during average load, you would want to multiply this concurrency by 2 or 3, for a peak load estimate of 4,000 to 6,000 users. This can be the basis for estimating your hardware requirements.
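The same calculation can be written as a short Python sketch. The function name concurrency is hypothetical, and the inputs are the figures used in the example above.

```python
def concurrency(requests_per_second, site_response_time_s):
    """Little's Law: simultaneous connections = throughput * response time."""
    return requests_per_second * site_response_time_s


total_response_s = 3    # response time seen by the user
network_response_s = 1  # portion attributed to the network
site_response_s = total_response_s - network_response_s

average = concurrency(1000, site_response_s)
# Apply the multipliers of 2 and 3 suggested for peak load.
peak_range = [average * factor for factor in (2, 3)]

print(average)     # 2000
print(peak_range)  # [4000, 6000]
```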

Note:

Little's Law is a simple equation to calculate. However, it is also easy to base the calculation on an incorrect assumption.

You can also use information about the number of simultaneously logged-in users to construct a Poisson distribution of the estimated number of concurrent users. A Poisson distribution is often used as a model for the number of events in a specific time period, such as the number of telephone calls at a business or the number of accidents at an intersection.

The following are input parameters for a Poisson distribution of concurrent users:

The total number of users who may be logged in during peak hours.
The time period for critical usage.
The estimated duration of a user session.

Information on calculating a Poisson distribution is readily available on the Web and in textbooks and will not be covered further in this chapter.
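For readers who want a starting point, the following Python sketch shows one way to turn the three input parameters into a Poisson estimate. It rests on a simplifying assumption that is not stated in this chapter: each user starts one session at a uniformly random time within the critical period, so the mean concurrency is the number of users multiplied by the session duration and divided by the length of the period. The function names and the sample inputs are hypothetical.

```python
import math


def mean_concurrency(total_users, period_minutes, session_minutes):
    """Expected simultaneous sessions under the uniform-arrival assumption."""
    return total_users * session_minutes / period_minutes


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def prob_at_least(k, lam):
    """P(X >= k): the chance of seeing k or more concurrent users."""
    below = sum(poisson_pmf(i, lam) for i in range(k))
    return max(0.0, 1.0 - below)


# Assumed inputs: 10,000 users, an 8-hour critical period, 4-minute sessions.
lam = mean_concurrency(10_000, 8 * 60, 4)
print(f"mean concurrency: {lam:.1f}")
print(f"P(120 or more concurrent users): {prob_at_least(120, lam):.6f}")
```

The tail probability indicates how much headroom a given capacity provides. If sessions cluster around particular times of day, the uniform-arrival assumption understates the peak.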

Note:

Keep in mind that although a complex analysis may help you feel more confident about system requirements estimates, there may be a number of unpredictable elements on your network that interfere with the accuracy of your estimates. As a result, using the simpler guidelines and recommendations in this chapter may, in fact, be the most efficient method for estimating peak usage.

1.3 Capacity Planning for Multiple Environments

A standard Oracle Access Manager deployment consists of several instances of various components installed for different purposes:

Development and Test

This is an area where you configure and test a fully deployable system. On a development or test server, a smaller amount of RAM may be required. For development and test systems, you can run the Web server on the same system as the Identity Server, assuming that only a small development team is using the machines rather than the whole user population.
Staging

This is an area where new application rollouts, software upgrades, and performance benchmarking can be performed without affecting your production and development environments. This environment can closely resemble the production environment, though it may be a scaled down image of it. For instance, a smaller amount of RAM may be used.
Production

This is an area where your end users can access the deployed system. Oracle recommends you use a replicated directory and Oracle Access Manager's native load-balancing features. You may also want to configure the Oracle Access Manager Servers for failover and load balancing and have separate machines for your Web servers. As an alternative, you can install multiple Web server instances on a single machine.

1.4 Example from an Actual Deployment

Figure 1-1 illustrates hardware and software choices in an actual mid-sized (10,000-20,000 user) deployment. Using a Poisson distribution, the estimated concurrency was predicted to be 74 users, with a 0.01% probability of having 100 or more users.

Figure 1-1 Hardware and Software Choices in an Actual Mid-Sized Deployment

For the production environment:

The servers are running Windows and Active Directory, IIS, and Oracle Access Manager.
Each server has two CPUs, 2 GB of RAM, and about 10 GB of disk space.
There are two Identity Servers running on different machines.
Multiple WebPass instances are installed on several intranet IIS servers, with at least one server acting as the administration console for the Identity System.
There are two Access Servers running on different machines.
There is one WebGate installed on each IIS intranet server.
The administrative console for the Access System is installed on an IIS server on one of the two Access and Identity Server boxes.

This environment could be scaled as follows:

The number of Oracle Access Manager Servers can be increased to spread the load and offer higher redundancy.
The number of CPUs and the amount of memory and disk space can be increased to improve server performance.
The Active Directory environment supports the addition of new servers for cases where this is desired to improve performance and reliability.