Oracle Internet Directory Administrator's Guide
Release 2.1.1

Part Number A86101-01


14
Capacity Planning

Capacity planning is the process of assessing applications' directory access requirements and ensuring that Oracle Internet Directory has adequate computer resources to service requests at an acceptable rate. This chapter explains what you need to consider when doing capacity planning. It guides you through an example of a directory deployment for an email messaging application in a hypothetical company called Acme Corporation.

This chapter contains these topics:

About Capacity Planning

If Oracle Internet Directory and the corresponding Oracle8i database are running on the same computer, then these are the configurable resources that capacity planners need to consider:

When you plan to acquire hardware for Oracle Internet Directory, you should ensure that all components--such as CPU, memory, and I/O--are effectively used. Generally, good memory usage and a robust I/O subsystem are sufficient to keep the CPU busy.

Any new installation of the Oracle Internet Directory needs two things to be successful:

We begin by looking at an example of a directory deployment for an email messaging application in a hypothetical company called Acme Corporation. As we examine each component of the capacity plan, we will apply our recommendations to the example of Acme Corporation.

Terms Used Throughout This Chapter

Throughput 

The overall rate at which directory operations are being completed by Oracle Internet Directory. This is typically represented as "operations per second." 

Latency 

The time a client has to wait for a given directory operation to complete 

Concurrent clients 

The total number of clients that have established a session with Oracle Internet Directory 

Concurrent operations 

The number of concurrent operations being executed on the directory by all of the concurrent clients. Note that this is not necessarily the same as the number of concurrent clients, because some clients may be keeping their sessions idle.

Getting to Know Directory Usage Patterns: A Case Study

The ability to assess the potential load on Oracle Internet Directory is very important for developing an accurate capacity plan. Let us examine the email messaging software employed by our hypothetical company, Acme Corporation. The email messaging software in this example is based on Internet Message Access Protocol (IMAP). There are two main types of software that access Oracle Internet Directory:

Let us assume that the private aliases and private distribution lists of individual users are also stored in the directory. Let us further make the following assumptions, which will allow us to guess the size of the directory:

Total user population                                                             40,000
Average number of private aliases per person                                      10
Average number of private distribution lists per person                           10
Total number of public distribution lists                                         4,000
Total number of public aliases in the company                                     1,000
Number of attributes in each entry in the directory related to this application   20
Number of cataloged attributes                                                    10

Based on the above assumptions, we can derive the overall count of entries in Oracle Internet Directory as:

User entries                          40,000 (these represent the users themselves)
Private aliases of users              40,000 x 10 = 400,000 entries
Private distribution lists of users   40,000 x 10 = 400,000 entries
Company-wide distribution lists       4,000
Company-wide aliases                  1,000
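The entry counts above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and are not part of Oracle Internet Directory.

```python
# Directory population for the Acme Corporation example.
# All figures come from the assumptions listed above; the names are illustrative.
users = 40_000
private_aliases_per_user = 10
private_lists_per_user = 10
public_lists = 4_000
public_aliases = 1_000

entry_counts = {
    "user entries": users,
    "private aliases": users * private_aliases_per_user,
    "private distribution lists": users * private_lists_per_user,
    "company-wide distribution lists": public_lists,
    "company-wide aliases": public_aliases,
}
total_entries = sum(entry_counts.values())

for name, count in entry_counts.items():
    print(f"{name:32s} {count:>9,}")
print(f"{'total':32s} {total_entries:>9,}")
```

The exact sum is 845,000 entries, which the chapter rounds up to about one million for planning purposes.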

The above assumptions yield a directory population of about one million entries. Given the user population and the directory population, let us then analyze usage patterns so that we can derive performance requirements from them. A typical user sends an average of 10 emails per day and receives an average of 10 emails per day from the outside world. Assuming that there are, on average, five recipients for each email sent by a user, this results in five directory lookups for each outbound email.

The following table summarizes all the possible directory lookups that can happen in one day:

Type of Directory Lookup  Number of Directory Lookups in One Day
The Mail Transfer Agent (MTA) processing outbound mail from each user  5 x 10 x 40,000 = 2,000,000
The MTA processing mail from the outside world  10 x 40,000 = 400,000
All other directory lookups (such as IMAP clients validating certain addresses)  800,000

Summing up, the total comes to about 3,200,000 (3.2 million) directory lookups per day. If these lookups were spread uniformly across the day, the directory would need to handle about 37 lookups per second (133,333 lookups per hour). Unfortunately, this is never the case. Usage analysis of the current email system over a period of 24 hours shows the pattern illustrated in Figure 14-1.

Figure 14-1 Usage Analysis of Current Email System



The email system (and Oracle Internet Directory) is stressed at its peak in the mornings. There are other usage peaks as well--one close to lunch time, and one near the end of business day. However, it is in the mornings that the Oracle Internet Directory is stressed the most.

Let us assume that 90 percent of all the directory lookups happen during normal working hours. Let us now split the working-hour load into the following categories (assuming an 8-hour workday):

Morning load 

65%: 0.90 x 0.65 x 3,200,000 = 1,872,000 lookups for 2 hours (936,000 lookups per hour) 

Afternoon load 

10%: 0.90 x 0.10 x 3,200,000 = 288,000 lookups for 1 hour (288,000 lookups per hour) 

Evening load 

20%: 0.90 x 0.20 x 3,200,000 = 576,000 lookups for 2 hours (288,000 lookups per hour) 

The above calculations indicate that the Oracle Internet Directory in this case should be designed to handle the peak load of 936,000 lookups per hour.
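The load figures in this section can be reproduced with the following sketch. It uses integer percentages to avoid floating-point rounding; the names are illustrative only.

```python
# Daily directory lookups for Acme Corporation (figures from the text above).
USERS = 40_000
outbound_lookups = 5 * 10 * USERS   # 5 recipients x 10 emails sent per user
inbound_lookups = 10 * USERS        # 10 emails received per user
other_lookups = 800_000             # all other directory lookups
daily_lookups = outbound_lookups + inbound_lookups + other_lookups

# Rate if the load were spread uniformly over 24 hours.
uniform_per_second = daily_lookups / (24 * 3600)

# Share of daily lookups that fall in working hours, and how that share
# is split: period -> (percent of working-hour load, duration in hours).
WORKING_HOURS_PCT = 90
periods = {"morning": (65, 2), "afternoon": (10, 1), "evening": (20, 2)}

hourly_load = {}
for name, (share_pct, hours) in periods.items():
    lookups = daily_lookups * WORKING_HOURS_PCT * share_pct // 10_000
    hourly_load[name] = lookups // hours

peak_per_hour = max(hourly_load.values())
peak_per_second = peak_per_hour / 3600

print(f"daily lookups:      {daily_lookups:,}")
print(f"uniform rate:       {uniform_per_second:.1f} lookups/second")
for name, load in hourly_load.items():
    print(f"{name:10s} load:    {load:,} lookups/hour")
print(f"peak rate:          {peak_per_second:.0f} lookups/second")
```

The morning period dominates, which is why the rest of the chapter sizes the system for 936,000 lookups per hour rather than for the daily average.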

Now that we know the data-set size as well as the performance requirements, we can look into the individual components of the installation and estimate good values for each.

I/O Subsystem Requirements

This section contains these topics:

About the I/O Subsystem

The I/O subsystem can be compared to a pump that pumps data to the CPUs to enable them to execute workloads. The I/O subsystem is also responsible for data storage. The main components of an I/O subsystem are arrays of disk drives controlled by disk controllers.

It is important to consider performance requirements when you size the I/O subsystem, rather than size based only on storage requirements. Although disk drives have increased in size, the throughput--that is, the rate at which the disk drive pumps data--has not increased in proportion. In sizing calculations for the I/O subsystem, you should use the following factors as input:

Given a range of I/O subsystems, you should always opt for the highest throughput drives. Typically, one can maximize the I/O throughput by one or more of the following techniques:

Some guidelines for organizing Oracle Internet Directory-specific data files are provided in Chapter 15, "Tuning". Depending on the tolerance of disk failures, different levels of Redundant Arrays of Inexpensive Disks (RAID) can also be considered.

Assuming that the decision has been made to get the best possible I/O subsystem, we focus the next section on deriving sizing estimates for the disks themselves.

Rough Estimates of Disk Space Requirements

You can use the following table to derive a rough estimate of the overall disk requirement:

Number of Entries in DIT  Disk Requirements
100,000                   450 MB to 650 MB
200,000                   850 MB to 1.5 GB
500,000                   2.5 GB to 3.5 GB
1,000,000                 4.5 GB to 6.5 GB
1,500,000                 6.5 GB to 10 GB
2,000,000                 9 GB to 13 GB

The data shown in the previous table makes the following assumptions:

Going back to our example of Acme Corporation, since our directory population is about one million, this would imply that our disk requirements are approximately 4.5 GB to 6.5 GB. Note that the assumptions made for Acme Corporation regarding the number of cataloged attributes are different, but the previous table should give an approximate figure of the size requirements.
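As a minimal sketch, the rough-estimate table can be turned into a lookup. The helper below is hypothetical. Rounding an entry count up to the next row of the table is an assumption made here for illustration, not a rule stated by this guide.

```python
# Rough disk estimates from the table above:
# (maximum number of entries, low estimate, high estimate).
ROUGH_DISK_ESTIMATES = [
    (100_000, "450 MB", "650 MB"),
    (200_000, "850 MB", "1.5 GB"),
    (500_000, "2.5 GB", "3.5 GB"),
    (1_000_000, "4.5 GB", "6.5 GB"),
    (1_500_000, "6.5 GB", "10 GB"),
    (2_000_000, "9 GB", "13 GB"),
]

def rough_disk_estimate(num_entries):
    """Return the (low, high) estimate of the smallest table row covering num_entries."""
    for max_entries, low, high in ROUGH_DISK_ESTIMATES:
        if num_entries <= max_entries:
            return low, high
    raise ValueError("entry count is beyond the range of the rough-estimate table")

low, high = rough_disk_estimate(1_000_000)
print(f"Acme Corporation (about one million entries): {low} to {high}")
```

Deployments whose entries differ from the table's assumptions should use the detailed calculations in the next section instead.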

Since the directory may be deployed for a wide variety of applications, these assumptions need not necessarily hold true for all possible situations: There might be cases where the size of attributes is large, the number of attributes per entry is large, extensive use of ACIs has been made, or the number of cataloged attributes is very high. For such cases, we present simple arithmetic procedures in the following section which will allow the planners to get a more detailed perspective of their disk requirements.

Detailed Calculations of Disk Space Requirements

Because Oracle Internet Directory stores all of its data in an Oracle8i database, the sizing for disk space is primarily a sizing of the underlying database. Oracle Internet Directory stores its data in the following tablespaces:

OLTS_ATTRSTORE 

Stores all of the attributes for all entries in the DIT 

OLTS_IND_ATTRSTORE 

Stores the indices pertaining to attributes in the directory 

OLTS_CT_DN 

Stores the distinguished name catalog 

OLTS_IND_CT_DN 

Stores the indices pertaining to the DN catalog 

OLTS_CT_CN 

Stores the common name catalog 

OLTS_CT_OBJCL 

Stores the ObjectClass catalog 

OLTS_CT_STORE 

Stores all the remaining (including user-defined) catalogs 

OLTS_IND_CT_STORE 

Stores the indices pertaining to the user-defined catalogs 

OLTS_DEFAULT 

Stores all of the data pertaining to the administration of the Oracle Internet Directory as well as the data used for replication support 

OLTS_TEMP 

Used for creating various indices on the tables. It should be large enough so that all index creations can go through. 

SYSTEM 

Required by Oracle8i database for various book-keeping purposes. Typically, its size remains constant at about 300MB. 

This section presents simple arithmetic procedures to determine the size requirements of each of the tablespaces shown above. All of the size calculations are based on the following variables:

Variable Name  Description 

num_entries 

Total number of entries in the directory 

attrs_per_entry 

Average number of attributes per directory entry 

avg_attr_size 

Average size of the attribute in bytes 

avg_dn_size 

Average size of the DN of an entry in bytes 

objectclass_per_entry 

Average number of object classes that an entry belongs to 

objectclass_size 

Average size of the name of each objectclass in bytes 

num_cataloged_attrs 

Number of cataloged attributes used in the entries 

entries_per_catalog 

Average number of entries per catalog table. This is required because not all cataloged attributes will be present in all entries in the DIT. 

change_log_capacity 

Number of changes that we wish to buffer for replication purposes 

num_acis 

Overall number of ACIs in the directory 

num_auditlog_entries 

Number of auditlog entries to store in the directory 

db_storage_ovhd 

Overhead of storing data in tables. This overhead corresponds to the relational constructs as well as operating system specific overhead. A value of 1.3 for this variable would represent a 30 percent overhead. The minimum value for this variable is 1. 

db_index_ovhd 

Overhead of storing data in indices. This overhead corresponds to the relational constructs as well as the operating system specific overhead. A value of 5 for this variable would represent a 400 percent overhead. The minimum value of this variable is 1. 

factor_of_safety 

Multiplier for accommodating growth and errors in calculations. A value of 1.3 for this variable would represent a 30 percent factor of safety. The minimum value for this variable is 1. 

Using the variables shown in the preceding table, the size of individual tablespaces can be calculated as follows:

Tablespace Name  Size 

OLTS_ATTRSTORE 

num_entries * attrs_per_entry * avg_attr_size * db_storage_ovhd 

OLTS_IND_ATTRSTORE 

num_entries * attrs_per_entry * 30 

OLTS_CT_DN 

num_entries * 2 * avg_dn_size 

OLTS_IND_CT_DN 

num_entries * 2 * (avg_dn_size + 30) 

OLTS_CT_CN 

num_entries * avg_dn_size * db_storage_ovhd 

OLTS_CT_OBJCL 

(num_entries * objectclass_per_entry * objectclass_size *db_storage_ovhd) + (num_auditlog_entries * 2 * avg_dn_size * db_storage_ovhd) 

OLTS_CT_STORE 

(entries_per_catalog * num_cataloged_attrs * avg_attr_size * db_storage_ovhd) + (num_entries * objectclass_per_entry * objectclass_size * db_storage_ovhd) 

OLTS_IND_CT_STORE 

(entries_per_catalog * num_cataloged_attrs * avg_attr_size * db_index_ovhd) + (num_entries * objectclass_per_entry * objectclass_size * db_index_ovhd) + (num_acis * 1.5 * avg_dn_size * db_index_ovhd) + (num_auditlog_entries * 2 * avg_dn_size * db_index_ovhd) 

OLTS_DEFAULT 

(change_log_capacity * 4 * avg_attr_size * db_storage_ovhd * db_index_ovhd) + (num_entries * 5) 

OLTS_TEMP 

(size of OLTS_IND_ATTRSTORE) + (size of OLTS_IND_CT_STORE) 

SYSTEM 

300 MB 

Using the arithmetic operations shown in the preceding table, one can compute the exact space requirements for a wide variety of Oracle Internet Directory deployment scenarios. The sum of the sizes of each of the tablespaces should yield the overall database disk requirement. One can optionally multiply that by the "factor_of_safety" variable to get a figure that can compensate for unforeseen circumstances.

Going back to our example of Acme Corporation, we can assign values to each of the variables based on the requirements stated in previous sections. The following table illustrates the values of each variable introduced in this section for Acme Corporation.

Variable Name          Value
num_entries            1,000,000
attrs_per_entry        20
avg_attr_size          32 bytes
avg_dn_size            40 bytes
objectclass_per_entry  5 (each entry belongs to an average of 5 object classes)
objectclass_size       10 bytes
num_cataloged_attrs    10
entries_per_catalog    1,000,000
change_log_capacity    80,000 changes (2 per user)
num_acis               80,000 ACIs (2 per user)
num_auditlog_entries   1,000
db_storage_ovhd        1.4 (40% overhead)
db_index_ovhd          5.0 (400% overhead)
factor_of_safety       1.5 (50% factor of safety)

If we now plug these values into the equations described earlier, we get the following values:

Tablespace Name     Size in Bytes  Size in MB  Size in MB (with factor of safety)
OLTS_ATTRSTORE      896000000      875         1313
OLTS_IND_ATTRSTORE  600000000      586         879
OLTS_CT_DN          80000000       78          117
OLTS_IND_CT_DN      140000000      137         205
OLTS_CT_CN          56000000       55          82
OLTS_CT_OBJCL       70112000       68          103
OLTS_CT_STORE       518000000      506         759
OLTS_IND_CT_STORE   1874400000     1830        2746
OLTS_DEFAULT        76680000       75          112
OLTS_TEMP           2474400000     2416        3625
SYSTEM              307200000      300         450
Total Size          7092792000     6927        10390

The table above shows that the estimated size of the database for Acme Corporation would be about 6.9 GB. With a 50 percent factor of safety, this would rise to 10.4 GB. If all of the data is loaded in bulk, then the bulkload tool of Oracle Internet Directory requires additional space, equal to 50 percent of the space occupied by the database, to store its temporary files. For Acme Corporation, this would add about 2.25 GB to 3.35 GB to the total space requirement.
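The sizing formulas from this section can be applied mechanically. The following sketch plugs in the Acme Corporation values and reproduces the byte counts in the table above. The function and dictionary names are illustrative. The conversion of 1 MB = 1,024,000 bytes is inferred from the table's own figures (for example, 896,000,000 bytes shown as 875 MB) and is an assumption of this sketch.

```python
# Sizing variables for Acme Corporation (values from the table of variables above).
acme = {
    "num_entries": 1_000_000, "attrs_per_entry": 20,
    "avg_attr_size": 32, "avg_dn_size": 40,
    "objectclass_per_entry": 5, "objectclass_size": 10,
    "num_cataloged_attrs": 10, "entries_per_catalog": 1_000_000,
    "change_log_capacity": 80_000, "num_acis": 80_000,
    "num_auditlog_entries": 1_000,
    "db_storage_ovhd": 1.4, "db_index_ovhd": 5.0, "factor_of_safety": 1.5,
}

BYTES_PER_MB = 1_024_000  # assumption: matches the "Size in MB" column above

def tablespace_sizes(v):
    """Apply the per-tablespace formulas; returns sizes in bytes."""
    n, dn = v["num_entries"], v["avg_dn_size"]
    s, i = v["db_storage_ovhd"], v["db_index_ovhd"]
    # Terms shared by several formulas.
    attrs = n * v["attrs_per_entry"]
    objcl = n * v["objectclass_per_entry"] * v["objectclass_size"]
    catalog = v["entries_per_catalog"] * v["num_cataloged_attrs"] * v["avg_attr_size"]
    audit = v["num_auditlog_entries"] * 2 * dn
    acis = v["num_acis"] * 1.5 * dn
    sizes = {
        "OLTS_ATTRSTORE": attrs * v["avg_attr_size"] * s,
        "OLTS_IND_ATTRSTORE": attrs * 30,
        "OLTS_CT_DN": n * 2 * dn,
        "OLTS_IND_CT_DN": n * 2 * (dn + 30),
        "OLTS_CT_CN": n * dn * s,
        "OLTS_CT_OBJCL": (objcl + audit) * s,
        "OLTS_CT_STORE": (catalog + objcl) * s,
        "OLTS_IND_CT_STORE": (catalog + objcl + acis + audit) * i,
        "OLTS_DEFAULT": v["change_log_capacity"] * 4 * v["avg_attr_size"] * s * i + n * 5,
    }
    sizes["OLTS_TEMP"] = sizes["OLTS_IND_ATTRSTORE"] + sizes["OLTS_IND_CT_STORE"]
    sizes["SYSTEM"] = 300 * BYTES_PER_MB
    # Round away floating-point noise introduced by the overhead multipliers.
    return {name: round(size) for name, size in sizes.items()}

sizes = tablespace_sizes(acme)
total_bytes = sum(sizes.values())
for name, size in sizes.items():
    print(f"{name:20s} {size:>13,} bytes {size / BYTES_PER_MB:>7.0f} MB")
print(f"{'Total':20s} {total_bytes:>13,} bytes {total_bytes / BYTES_PER_MB:>7.0f} MB")
print(f"With factor of safety: {total_bytes * acme['factor_of_safety'] / BYTES_PER_MB:.0f} MB")
```

Changing any entry in the dictionary, such as num_cataloged_attrs or avg_attr_size, shows how sensitive the index tablespaces are to cataloging decisions.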

Memory Requirements

Memory is used for a number of distinct tasks by any database application, including Oracle Internet Directory. If memory resources are insufficient for any of these tasks, the bottleneck causes the CPUs to work at lower efficiency and system performance to drop. Furthermore, memory usage increases in proportion to the number of concurrent connections to the database and the number of concurrent users of the directory.

The memory available to processes comes from the virtual memory on the system, which is somewhat more than available physical memory. If the sum of all active memory usage exceeds the available physical memory on the system, the operating system may need to store some of the memory pages on disk. This is called paging. Paging can degrade performance if memory is too oversubscribed. Generally, you should not exceed 20 percent over-subscription of physical memory. If paging occurs, you need either to scale back memory usage by processes or to add more physical memory. Keep in mind the trade-offs: There are physical limits to the amount of memory you can add, but scaling back on per-process memory usage can significantly degrade performance.
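The 20 percent guideline can be expressed as a simple check. In this sketch, over-subscription is taken to mean the fraction by which active memory usage exceeds physical memory; that definition and the sample figures are assumptions made for illustration.

```python
MAX_OVERSUBSCRIPTION = 0.20  # guideline from the text: at most 20 percent

def oversubscription(active_memory_mb, physical_memory_mb):
    """Fraction by which active memory usage exceeds physical memory (0.0 if it fits)."""
    excess = active_memory_mb - physical_memory_mb
    return max(0.0, excess / physical_memory_mb)

# Hypothetical figures, for illustration only.
physical_mb = 1024
active_mb = 1150

ratio = oversubscription(active_mb, physical_mb)
print(f"over-subscription: {ratio:.1%}")
if ratio <= MAX_OVERSUBSCRIPTION:
    print("within the guideline")
else:
    print("scale back per-process memory usage or add physical memory")
```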

The main consumer of memory is the database buffer cache within the System Global Area (SGA). The more memory allocated to this, the better will be the buffer cache hit ratio. A good buffer cache hit ratio will result in good database performance which in turn will result in good performance of the Oracle Internet Directory.

See Also:

Chapter 15, "Tuning" for further information on SGA tuning 

The following table gives minimum memory requirements for different directory configurations:

Directory Type  Entry Count             Minimum Memory
Small           Less than 600,000       512 MB
Medium          600,000 to 2,000,000    1 GB
Large           Greater than 2,000,000  2 GB

Going back to our example of Acme Corporation, the number of entries in the directory is close to 1,000,000 (1 million). Although this places the directory in the Medium category, with a minimum of 1 GB, Oracle Corporation recommends choosing the 2 GB option in order to maximize performance.

Network Requirements

The network is rarely a bottleneck in most installations. However, serious consideration must be given to it during the capacity planning stage. If clients do not get adequate network bandwidth to send messages to and receive messages from Oracle Internet Directory, the overall throughput will seem very low. For example, suppose we have configured Oracle Internet Directory to service 800 search operations per second, but the computer running the Oracle directory server is accessible only through a 10 Mbps network (10-Base-T switched Ethernet) on which only 60 percent of the bandwidth is available. The clients will then see a throughput of only 600 search operations per second (assuming each search operation causes 1024 bytes to be transferred on the network). The following table shows the maximum possible throughput (in operations per second) for two types of operations (one requiring a transfer of 1024 bytes, the other a transfer of 2048 bytes) on two types of networks, 10 Mbps and 100 Mbps, at different rates of bandwidth availability:

In some cases, it may also be important to consider the network latency of sending a message from a client to the Oracle directory server. In some WAN implementations, the network latencies may become as high as 500 milliseconds, which may cause the clients to time out for certain operations. In summary, given a range of networking options, the preferred choice should always be for highest bandwidth, lowest latency network.

Going back to the example of Acme Corporation, their peak usage rate is 936,000 lookups per hour, which results in an equivalent number of lookup operations to the directory. This requires about 260 directory operations per second. Assuming that each operation results in a transfer of 2 KB of data on the network, this would imply that we should have a 100 Mbps network or at least 60 percent bandwidth available on a 10 Mbps network. Since the 100 Mbps network will typically have a lower latency, we will choose that over the 10 Mbps network.
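The bandwidth arithmetic can be sketched as follows. The helpers are hypothetical and count payload bits only (8 bits per byte, no protocol or framing overhead), so they give an upper bound on throughput; the figures quoted in the text are more conservative.

```python
def required_mbps(ops_per_second, bytes_per_op):
    """Raw bandwidth (in Mbps) needed to carry the payload of the given operation rate."""
    return ops_per_second * bytes_per_op * 8 / 1_000_000

def max_ops_per_second(link_mbps, available_fraction, bytes_per_op):
    """Upper bound on operations per second for a link with the given available share."""
    return link_mbps * 1_000_000 * available_fraction / (bytes_per_op * 8)

# Acme Corporation: peak load of 936,000 lookups per hour, 2 KB per operation.
peak_ops = 936_000 / 3600
need = required_mbps(peak_ops, 2048)

print(f"peak rate:          {peak_ops:.0f} operations/second")
print(f"payload bandwidth:  {need:.2f} Mbps")
for link in (10, 100):
    ceiling = max_ops_per_second(link, 0.60, 2048)
    print(f"{link:>3} Mbps at 60% availability: at most {ceiling:,.0f} operations/second")
```

Because real links lose part of their capacity to overhead and contention, it is prudent to keep a comfortable margin between the payload bandwidth and the available bandwidth, as the text does when it prefers the 100 Mbps network.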

CPU Requirements

This section contains these topics:

CPU Configuration

The CPU sizing for Oracle Internet Directory is directly a function of the user workload. The following factors will determine CPU configuration:

CPU resources can be added to a system as the workload increases, but these additions seldom bring linear scalability to all operations since a lot of operations are not purely CPU bound. We classify the processing power of a computer by a performance characteristic that is commonly available from all vendors, namely, SPECint_rate95 baseline. This number is derived from a set of integer tests and is available from all system vendors as well as the SPEC web site (www.spec.org).


Note:

SPECint_rate95 should not be confused with the regular SPECint95 performance number. The SPECint95 performance number gives an idea of the integer processing power of a particular CPU (for systems with multiple CPUs, this number is typically normalized). The SPECint_rate95 gives the integer processing power of an entire system without any normalization.  


Because Oracle Internet Directory makes efficient use of multiple CPUs on an SMP computer, we chose to categorize computers based on their SPECint_rate95 numbers. Even within SPECint_rate95 we chose the baseline number as opposed to the commonly advertised result. This is because the commonly advertised result is actually the peak performance of a computer, whereas the baseline number represents the performance in normal circumstances.

Rough Estimates of CPU Requirements

Since Oracle Internet Directory is typically co-resident with the Oracle8i database, we recommend at least a two-CPU system. We give the following rough estimates based on the level of usage of Oracle Internet Directory:

Usage              Num CPUs  SPECint_rate95 baseline  System
Departmental       2         60 to 200                Compaq AlphaServer 8400 5/300 (300 MHz x 2)
Organization wide  4         200 to 350               IBM RS/6000 J50 (200 MHz x 4)
Enterprise wide    4+        350+                     Sun Ultra 450 (296 MHz x 4)

Detailed Calculations of CPU Requirements

It is difficult to determine the CPU requirements for all operations at a given deployment site since the amount of CPU consumed depends upon several factors, such as:

In most of the cases, except SSL, we can expect that there is a large latency between the Oracle Internet Directory server process and the database. When a thread in the Oracle Internet Directory server process is waiting for the database to respond, other threads within the Oracle Internet Directory server process can be put to work by other client requests needing LDAP server specific processing. As a result, for any mix of operations, one can always come up with a combination of concurrent clients and Oracle Internet Directory server processes that will result in 100 percent CPU utilization. In this case, the CPU becomes the bottleneck.

Given this fact, we have taken the operation that consumes the smallest number of CPU cycles, a base search, and estimated the number of concurrent operations at which CPU usage peaked on various computers. We then correlated this with the SPECint_rate95 baseline number of those computers. With this correlation, given a certain amount of concurrency in the user load, one can find a lower bound on the processing power required by Oracle Internet Directory. The following formula relates concurrency to the SPECint_rate95 baseline number for this release of Oracle Internet Directory:

SPECint_rate95 baseline = 6.0 * (concurrent base search operations)

For example, if we need a computer that is capable of handling 50 concurrent base search operations before saturating the CPU, we would require a computer that has a SPECint_rate95 baseline rating of about 300.

Taking this number as the baseline, we can find the CPU requirements of other operations if we express them as some factor of the base search operations. The following factors may be used in addition to others:

Going back to our example of Acme Corporation, let us assume that we want adequate CPU resources to support about 100 concurrent operations. Assuming that each search returns 1.5 entries, and adding a factor of safety of 20 percent, our preliminary estimate of the CPU requirements would be:

SPECint_rate95 baseline = 6.0 * 100 * (1 + 0.2 * 1.5) * 1.2 = 600 * 1.3 * 1.2 = 936
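The estimate above can be written as a small function. This is a sketch: the function name and parameters are illustrative, and the cost factor of 1 + 0.2 per returned entry is taken directly from the worked equation above rather than from any published API.

```python
BASELINE_PER_BASE_SEARCH = 6.0  # SPECint_rate95 baseline per concurrent base search

def specint_rate95_baseline(concurrent_ops, cost_factor=1.0, safety_factor=1.0):
    """Lower bound on the SPECint_rate95 baseline rating for a given concurrency.

    cost_factor expresses the operation's cost relative to a base search;
    safety_factor adds headroom (1.2 means a 20 percent factor of safety).
    """
    return BASELINE_PER_BASE_SEARCH * concurrent_ops * cost_factor * safety_factor

# 50 concurrent base searches, no extra factors.
print(round(specint_rate95_baseline(50)))

# Acme Corporation: 100 concurrent searches returning 1.5 entries each,
# with a 20 percent factor of safety.
acme_rating = specint_rate95_baseline(
    100, cost_factor=1 + 0.2 * 1.5, safety_factor=1.2
)
print(round(acme_rating))
```

The result is a lower bound only, since it assumes the workload is no more expensive than the searches it models.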

Looking at the systems listed on the SPEC web site (www.spec.org), we can identify the smallest configurations that meet this requirement. The next table shows some of the computers that Acme Corporation can consider using for Oracle Internet Directory.

Company                   Model                    CPUs  CPU type                SPECint_rate95 baseline
Sun Microsystems          ES 4002                  12    250 MHz UltraSPARC II   943
Siemens Nixdorf           RM600 Model E60                250 MHz R10000          970
Hewlett-Packard           HP SPP1600               32    120 MHz PA-RISC 7200    996
SGI                       Origin2000                     250 MHz MIPS R10000     1001
Data General Corporation  AViiON AV 20000          16    Pentium Pro (200 MHz)   1007
Sun Microsystems          Sun Enterprise 3500            400 MHz UltraSPARC II   1011
Sun Microsystems          Sun Enterprise 3500            400 MHz UltraSPARC II   1030
Hewlett-Packard           HP 9000 Model N4000            440 MHz PA-RISC 8500    1093
Hewlett-Packard           HP 9000 Model T600       12    180 MHz PA-RISC 8000    1099
Siemens AG                RM600 Model E80                285 MHz R12000          1103
Compaq Corporation        AlphaServer 8400 5/440   12    437 MHz 21164           1146
Compaq Corporation        AlphaServer 8400 5/625         612 MHz 21164           1153
SGI                       Origin2000               16    195 MHz MIPS R10000     1182
Sun Microsystems          Sun Enterprise 4000      12    336 MHz UltraSPARC II   1211

Summary of Capacity Plan for Acme Corporation

In the preceding sections, we have described various components involved in capacity planning and have also shown how each of them would apply to an Oracle Internet Directory deployment at a hypothetical company named Acme Corporation. In this section we give a quick summary of all of the recommendations made. Following were the initial assumptions:

Based on the above requirements and further assumptions, we developed the following recommendations:

Several simplifying assumptions were made so that the sizing calculations could be more intuitive.


Copyright © 1996-2000, Oracle Corporation.

All Rights Reserved.
