Oracle® Internet Directory Administrator's Guide,
10g Release 2 (10.1.2)
B14082-02
 

20 Capacity Planning for the Directory

Capacity planning is the process of assessing applications' directory access requirements and ensuring that Oracle Internet Directory has adequate computer resources to service requests at an acceptable rate. This chapter explains what you need to consider when doing capacity planning. It guides you through an example of a directory deployment for an e-mail messaging application in a hypothetical company called Acme Corporation.


20.1 About Capacity Planning

If Oracle Internet Directory and the corresponding Oracle Database are running on the same computer, then these are the configurable resources that capacity planners need to consider:

When you plan to acquire hardware for Oracle Internet Directory, you should ensure that all components—such as CPU, memory, and I/O—are effectively used. Generally, good memory usage and a robust I/O subsystem are sufficient to keep the CPU busy.

To be successful, every new installation of the Oracle Internet Directory requires:

Table 20-1 defines important terms used in this chapter.

Table 20-1 Capacity Planning Terminology

Term Definition

Throughput

The overall rate at which directory operations are being completed by Oracle Internet Directory. This is typically represented as "operations every second."

Latency

The time a client has to wait for a given directory operation to complete

Concurrent clients

The total number of clients that have established a session with Oracle Internet Directory

Concurrent operations

The number of concurrent operations that are being executed on the directory from all of the concurrent clients. Note that this is not necessarily the same as the number of concurrent clients because some of the clients may be keeping their sessions idle.


In this chapter, we look at an example of a directory deployment for an e-mail messaging application in a hypothetical company called Acme Corporation. As we examine each component of the capacity plan, we apply our recommendations to the example of Acme Corporation.

20.2 Getting to Know Directory Usage Patterns: A Case Study

The ability to assess the potential load on Oracle Internet Directory is very important for developing an accurate capacity plan. Let us examine the e-mail messaging software employed by our hypothetical company, Acme Corporation. The e-mail messaging software in this example is based on Internet Message Access Protocol (IMAP). There are two main types of software that access Oracle Internet Directory:

Let us assume that the private aliases and private distribution lists of individual users are also stored in the directory. Let us further make the assumptions in Table 20-2 that enable us to guess the size of the directory.

Table 20-2 Assumptions about Entry Types and Their Sizes

Entry Type                                                                       Size

Total user population                                                            40,000
Average number of private aliases for each person                                10
Average number of private distribution lists for each person                     10
Total number of public distribution lists                                        4,000
Total number of public aliases in the company                                    1,000
Number of attributes in each entry in the directory related to this application  20
Number of cataloged attributes                                                   10


Based on these assumptions, we can derive the overall count of entries in Oracle Internet Directory as described in Table 20-3.

Table 20-3 Overall Count of Entries

Entry Type                           Number of Entries

User entries                         40,000 (these represent the users themselves)
Private aliases of users             40,000 x 10 = 400,000 entries
Private distribution lists of users  40,000 x 10 = 400,000 entries
Company-wide distribution lists      4,000
Company-wide aliases                 1,000


These assumptions yield a directory population of 845,000 entries, which we round up to approximately one million for planning purposes. Given the user population and the directory population, let us then analyze usage patterns so that we can derive performance requirements from them. A typical user sends an average of 10 e-mails every day and receives an average of 10 e-mails a day from the outside world. Assuming an average of five recipients for each e-mail sent by a user, each outbound e-mail results in five directory lookups.
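The entry count behind this estimate is simple arithmetic on the figures in Table 20-2. The following is a minimal sketch in Python; the variable names are illustrative only and are not part of Oracle Internet Directory.

```python
# Entry-count estimate for Acme Corporation, using the figures in Table 20-2.
# All names below are illustrative.
users = 40_000
private_aliases_per_user = 10
private_lists_per_user = 10
public_lists = 4_000
public_aliases = 1_000

entry_counts = {
    "user entries": users,
    "private aliases": users * private_aliases_per_user,
    "private distribution lists": users * private_lists_per_user,
    "company-wide distribution lists": public_lists,
    "company-wide aliases": public_aliases,
}
total_entries = sum(entry_counts.values())

for name, count in entry_counts.items():
    print(f"{name:35s}{count:>10,}")
print(f"{'total':35s}{total_entries:>10,}")
```

The total is 845,000 entries, which is the basis for the planning figure of about one million.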

Table 20-4 summarizes all the possible directory lookups that can happen in one day.

Table 20-4 Directory Lookups in a Single Day

Type of Directory Lookup                                                                 Number of Directory Lookups in One Day

The Mail Transfer Agent (MTA) processing outbound mail from each user                    5 x 10 x 40,000 = 2,000,000
The MTA processing mail from the outside world                                           10 x 40,000 = 400,000
All other directory lookups (like IMAP clients validating certain addresses, and so on)  800,000


To summarize: the total number of directory lookups every day would be about 3,200,000 (3.2 million). If these lookups were spread out uniformly throughout the day, the directory would need to handle about 37 lookups every second (133,333 lookups every hour). In practice, the load is never this uniform.
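The daily totals and the uniform-load rates can be reproduced as follows. This is a sketch under the assumptions stated in this section; the variable names are illustrative.

```python
# Daily directory lookups for Acme Corporation (figures from Table 20-4).
users = 40_000
mails_sent_per_user = 10       # outbound e-mails per user per day
recipients_per_mail = 5        # one directory lookup per recipient
mails_received_per_user = 10   # inbound e-mails per user per day
other_lookups = 800_000        # IMAP clients validating addresses, and so on

outbound_lookups = recipients_per_mail * mails_sent_per_user * users
inbound_lookups = mails_received_per_user * users
lookups_per_day = outbound_lookups + inbound_lookups + other_lookups

# Rates if the load were spread evenly over 24 hours.
lookups_per_hour = lookups_per_day / 24
lookups_per_second = lookups_per_day / (24 * 60 * 60)

print(f"lookups per day:    {lookups_per_day:,}")
print(f"lookups per hour:   {lookups_per_hour:,.0f}")
print(f"lookups per second: {lookups_per_second:.1f}")
```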

Usage analysis of the current e-mail system over a period of 24 hours shows the pattern illustrated in Figure 20-1.

Figure 20-1 Usage Analysis of Current E-mail System


The e-mail system and Oracle Internet Directory are maximally stressed in the mornings. There are other usage peaks as well: one close to lunch time, and one near the end of business day. However, it is in the mornings that the Oracle Internet Directory is stressed the most.

Let us assume that 90 percent of all the directory lookups happen during normal working hours. Table 20-5 shows the shift load for the morning, afternoon, and evening periods of an eight-hour day.

Table 20-5 Working Hour Loads

Shift Load      Lookups

Morning load    65%: 0.90 x 0.65 x 3,200,000 = 1,872,000 lookups for 2 hours (936,000 lookups every hour)
Afternoon load  10%: 0.90 x 0.10 x 3,200,000 = 288,000 lookups for 1 hour (288,000 lookups every hour)
Evening load    20%: 0.90 x 0.20 x 3,200,000 = 576,000 lookups for 2 hours (288,000 lookups every hour)


These calculations indicate that the directory in this case should be designed to handle the peak load of 936,000 lookups every hour.
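The shift loads in Table 20-5 and the resulting peak can be checked with the sketch below. Percentages are kept as integers so that the arithmetic stays exact; the names are illustrative.

```python
# Peak-hour load for Acme Corporation, following Table 20-5.
DAILY_LOOKUPS = 3_200_000
WORKING_HOURS_PERCENT = 90  # share of daily lookups during working hours

# shift name -> (share of working-hours lookups in percent, duration in hours)
shifts = {
    "morning": (65, 2),
    "afternoon": (10, 1),
    "evening": (20, 2),
}

def hourly_load(share_percent, hours):
    """Lookups per hour during a shift."""
    shift_total = DAILY_LOOKUPS * WORKING_HOURS_PERCENT * share_percent // 10_000
    return shift_total // hours

hourly = {name: hourly_load(*spec) for name, spec in shifts.items()}
peak_shift = max(hourly, key=hourly.get)
peak_per_hour = hourly[peak_shift]
peak_per_second = peak_per_hour / 3600

print(f"peak shift: {peak_shift}, {peak_per_hour:,} lookups/hour, "
      f"{peak_per_second:.0f} lookups/second")
```

Expressed per second, the morning peak is 260 lookups, the figure used later in the network sizing.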

Now that we know the data-set size as well as the performance requirements, we can look into individual components of the installation and estimate good values for each.

20.3 I/O Subsystem Requirements


20.3.1 About the I/O Subsystem

The I/O subsystem can be compared to a pump that sends data to the CPUs to enable them to execute workloads. The I/O subsystem is also responsible for data storage. The main components of an I/O subsystem are arrays of disk drives controlled by disk controllers.

It is important to consider performance requirements when you size the I/O subsystem, rather than size based only on storage requirements. Although disk drives have increased in size, the throughput—that is, the rate at which the disk drive pumps data—has not increased in proportion. In sizing calculations for the I/O subsystem, you should use the following factors as input:

  • The size of the database

  • The number of CPUs on the system

  • An initial estimation of the workload on the Oracle Internet Directory

  • The rate at which the disk can pump data

  • Space needed to stage data prior to load

  • Space needed for index creation and sort activities

Given a range of I/O subsystems, you should always opt for the highest throughput drives. Typically, one can maximize the I/O throughput by one or more of the following techniques:

  • Striping logical volumes so that the I/O operations use multiple disk spindles

  • Putting different tablespaces in different logical and physical disk volumes

  • Distributing the disk volumes on multiple I/O controllers

Some guidelines for organizing Oracle Internet Directory-specific data files are provided in Chapter 21, "Tuning Considerations for the Directory". Depending on the tolerance of disk failures, different levels of Redundant Arrays of Inexpensive Disks (RAID) can also be considered.

Assuming that the decision has been made to get the best possible I/O subsystem, we focus the next section on deriving sizing estimates for the disks themselves.

20.3.2 Rough Estimates of Disk Space Requirements

You can use Table 20-6 to derive a rough estimate of the overall disk requirement.

Table 20-6 Disk Space Requirements

Number of Entries in DIT  Disk Requirements

100,000                   450 MB to 650 MB
200,000                   850 MB to 1.5 GB
500,000                   2.5 GB to 3.5 GB
1,000,000                 4.5 GB to 6.5 GB
1,500,000                 6.5 GB to 10 GB
2,000,000                 9 GB to 13 GB


The data shown in Table 20-6 makes the following assumptions:

  • There are about 20 cataloged attributes.

  • There are about 25 attributes for each entry.

  • The average size of an attribute is about 30 bytes.

Going back to our example of Acme Corporation, since our directory population is about one million, this would imply that our disk requirements are approximately 4.5 GB to 6.5 GB. Note that the assumptions made for Acme Corporation regarding the number of cataloged attributes are different, but the previous table should give an approximate figure of the size requirements.

Since the directory may be deployed for a wide variety of applications, these assumptions need not necessarily hold true for all possible situations: There might be cases where the size of attributes is large, the number of attributes for each entry is large, extensive use of ACIs has been made, or the number of cataloged attributes is very high. For such cases, we present simple arithmetic procedures in the following section which will allow the planners to get a more detailed perspective of their disk requirements.

20.3.3 Detailed Calculations of Disk Space Requirements

Because Oracle Internet Directory stores all of its data in an Oracle Database, the sizing for disk space is primarily a sizing of the underlying database. Oracle Internet Directory stores its data in the tablespaces described in Table 20-7.

Table 20-7 Tablespaces Used to Store Oracle Internet Directory Data

Tablespace Name Contents

OLTS_ATTR_STORE

Stores all of the attributes for all entries in the DIT

OLTS_CT_STORE

Stores all the remaining (including user-defined) catalogs and the indexes defined in the catalogs

OLTS_DEFAULT

Stores all of the data pertaining to the administration of the Oracle Internet Directory as well as the data used for replication support

OLTS_SVRMGSTORE

Stores all the tables and indexes required for Oracle Internet Directory Server Manageability

SYSTEM

Required by Oracle Database for various book-keeping purposes. Typically, its size remains constant at about 300MB.


This section presents simple arithmetic procedures to determine the size requirements of each of the tablespaces referred to in Table 20-7. All of the size calculations are based on the variables in Table 20-8.

Table 20-8 Variables Used for Size Calculation

Variable Name Description

num_entries

Total number of entries in the directory

attrs_per_entry

Average number of attributes for each directory entry

avg_attr_size

Average size of the attribute value in bytes

avg_dn_size

Average size of the DN of an entry in bytes

objectclass_per_entry

Average number of object classes that an entry belongs to

objectclass_size

Average size of the name of each objectclass in bytes

num_cataloged_attrs

Number of cataloged attributes used in the entries

entries_per_catalog

Average number of entries for each catalog table. This is required because not all cataloged attributes will be present in all entries in the DIT.

change_log_capacity

Number of changes that we wish to buffer for replication purposes

num_acis

Overall number of ACIs in the directory

num_auditlog_entries

Number of auditlog entries to store in the directory

db_storage_ovhd

Overhead of storing data in tables. This overhead corresponds to the relational constructs as well as operating system specific overhead. A value of 1.3 for this variable would represent a 30 percent overhead. The minimum value for this variable is 1.

db_index_ovhd

Overhead of storing data in indexes. This overhead corresponds to the relational constructs as well as the operating system specific overhead. A value of 5 for this variable would represent a 400 percent overhead. The minimum value of this variable is 1.

factor_of_safety

Multiplier for accommodating growth and errors in calculations. A value of 1.3 for this variable would represent a 30 percent factor of safety. The minimum value for this variable is 1.

initial_num_entries

Total number of entries that are initially bulk-loaded into the directory

avg_attrname_len

Average size of attribute name, in bytes

num_stats_entries

Number of statistics entries generated by OID Server Manageability when the host DSF attribute 'orclstatsflag' is enabled

attrs_per_stats_entry

Average number of attributes for each statistics entry


Using the variables shown in Table 20-8, the size of individual tablespaces can be calculated as shown in Table 20-9.

Table 20-9 Size of Individual Tablespaces

Tablespaces Containing Tables Formula

ATTRSTORE_INDEX_SIZE

num_entries*(attrs_per_entry+6) *10

CATALOG_INDEX_SIZE

entries_per_catalog*num_cataloged_attrs*avg_attr_size*db_index_ovhd +num_entries*objectclass_per_entry*objectclass_size*db_index_ovhd + num_acis*1.5*avg_dn_size*db_index_ovhd + num_auditlog_entries*2*avg_dn_size*db_index_ovhd

CN_SIZE

num_entries*avg_dn_size*db_storage_ovhd

DN_INDEX_SIZE

num_entries*2*(avg_dn_size * 3)

DN_SIZE

num_entries*2*(avg_dn_size+4)

OBJECTCLASSES_SIZE

num_entries*objectclass_per_entry*objectclass_size*db_storage_ovhd + num_auditlog_entries*2*avg_dn_size*db_storage_ovhd

OLTS_ATTR_STORE

(num_entries*(((attrs_per_entry)*(avg_attrname_len+avg_attr_size+22))+6*35)*db_storage_ovhd)+attrstore_index_size

OLTS_BATTRSTORE

6M+(((num_binary_attrs*avg_binval_length)+6*35)*db_storage_ovhd)

OLTS_CT_STORE

(cn_size+objectclasses_size+dn_size+catalog_index_size+dn_index_size)

OLTS_DEFAULT

(change_log_capacity*4*avg_attr_size*db_storage_ovhd*db_index_ovhd) + (initial_num_entries*2*(avg_dn_size+4))

OLTS_SVRMGSTORE

2M+num_stats_entries*((avg_attrname_len+avg_attr_size+20)*(2*attrs_per_stats_entry)*db_storage_ovhd*(orclstatsperiodicity/10)*12)

SYSTEM

300MB


Use the arithmetic operations shown in the preceding table to compute the exact space requirements for a wide variety of Oracle Internet Directory deployment scenarios. The sum of the sizes of each of the tablespaces should yield the overall database disk requirement. One can optionally multiply that by the "factor_of_safety" variable to get a figure that can compensate for unforeseen circumstances.

Going back to our example of Acme Corporation, we can assign values to each of the variables based on the requirements stated in previous sections. Table 20-10 illustrates the values of each variable introduced in this section for Acme Corporation.

Table 20-10 Values for Variables Used for Sizing Calculations

Variable Name          Value

num_entries            1,000,000
attrs_per_entry        20
avg_attr_size          32 bytes
avg_dn_size            40 bytes
objectclass_per_entry  5 (each entry belongs to an average of 5 object classes)
objectclass_size       10 bytes
num_cataloged_attrs    10
entries_per_catalog    1,000,000
change_log_capacity    80,000 changes (2 for each user)
num_acis               80,000 ACIs (2 for each user)
num_auditlog_entries   1,000
db_storage_ovhd        1.4 (40% overhead)
db_index_ovhd          5.0 (400% overhead)
factor_of_safety       1.5 (50% factor of safety)
initial_num_entries    1,000,000
num_stats_entries      5
attrs_per_stats_entry  12
orclstatsperiodicity   60 (root DSE attribute)
avg_attrname_len       6 bytes


If we now plug these values into the equations described earlier, we get the values listed in Table 20-11.

Table 20-11 Tablespace Sizes

Tablespace Name  Size in Bytes  Size in MB

OLTS_ATTRSTORE   2,234,000,000  2182
OLTS_CT_STORE    2,328,512,000  2274
OLTS_DEFAULT     159,680,000    156
OLTS_SVRMGSTORE  2,701,568      3
SYSTEM           314,572,800    300
Total Size       5,038,093,862  4920


Table 20-11 shows that the estimated size of the database for Acme Corporation would be about 8.25 GB. If all of the data is being loaded in bulk, then the bulkload tool of Oracle Internet Directory would require an additional 30 percent of space occupied by the database to store its temporary files. For Acme Corporation, this would add about 2.5 GB to the total space requirement.
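The formulas in Table 20-9 can be evaluated mechanically. The following Python sketch plugs in the Acme Corporation values from Table 20-10. It makes two assumptions that the chapter does not state: that "2M" means 2,000,000 bytes, and that the 300 MB SYSTEM tablespace is 300 x 1024 x 1024 bytes. OLTS_BATTRSTORE is left out because Table 20-10 gives no values for num_binary_attrs or avg_binval_length. Fractions are used so that the 1.4 and 1.5 multipliers do not introduce floating-point rounding.

```python
from fractions import Fraction

# Values from Table 20-10 (Acme Corporation).
num_entries = 1_000_000
attrs_per_entry = 20
avg_attr_size = 32
avg_dn_size = 40
objectclass_per_entry = 5
objectclass_size = 10
num_cataloged_attrs = 10
entries_per_catalog = 1_000_000
change_log_capacity = 80_000
num_acis = 80_000
num_auditlog_entries = 1_000
db_storage_ovhd = Fraction("1.4")
db_index_ovhd = 5
initial_num_entries = 1_000_000
num_stats_entries = 5
attrs_per_stats_entry = 12
orclstatsperiodicity = 60
avg_attrname_len = 6
M = 1_000_000  # assumption: "2M" in Table 20-9 means 2,000,000 bytes

# Intermediate quantities from Table 20-9.
attrstore_index_size = num_entries * (attrs_per_entry + 6) * 10
catalog_index_size = db_index_ovhd * (
    entries_per_catalog * num_cataloged_attrs * avg_attr_size
    + num_entries * objectclass_per_entry * objectclass_size
    + num_acis * Fraction(3, 2) * avg_dn_size
    + num_auditlog_entries * 2 * avg_dn_size
)
cn_size = num_entries * avg_dn_size * db_storage_ovhd
dn_index_size = num_entries * 2 * (avg_dn_size * 3)
dn_size = num_entries * 2 * (avg_dn_size + 4)
objectclasses_size = db_storage_ovhd * (
    num_entries * objectclass_per_entry * objectclass_size
    + num_auditlog_entries * 2 * avg_dn_size
)

# Tablespace sizes in bytes.
raw_sizes = {
    "OLTS_ATTR_STORE": num_entries
    * (attrs_per_entry * (avg_attrname_len + avg_attr_size + 22) + 6 * 35)
    * db_storage_ovhd
    + attrstore_index_size,
    "OLTS_CT_STORE": cn_size + objectclasses_size + dn_size
    + catalog_index_size + dn_index_size,
    "OLTS_DEFAULT": change_log_capacity * 4 * avg_attr_size
    * db_storage_ovhd * db_index_ovhd
    + initial_num_entries * 2 * (avg_dn_size + 4),
    "OLTS_SVRMGSTORE": 2 * M + num_stats_entries * (
        (avg_attrname_len + avg_attr_size + 20) * (2 * attrs_per_stats_entry)
        * db_storage_ovhd * Fraction(orclstatsperiodicity, 10) * 12
    ),
    "SYSTEM": 300 * 1024 * 1024,  # assumption: 300 MB taken as 300 x 2**20
}
sizes = {name: int(size) for name, size in raw_sizes.items()}
total_bytes = sum(sizes.values())

for name, size in sizes.items():
    print(f"{name:16s}{size:>16,}")
print(f"{'total':16s}{total_bytes:>16,}")
```

For OLTS_CT_STORE, OLTS_DEFAULT, and OLTS_SVRMGSTORE the sketch yields 2,328,512,000, 159,680,000, and 2,701,568 bytes. Other figures may differ slightly from published values, depending on rounding and on how megabytes are counted.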

20.4 Memory Requirements

Memory is used for a number of distinct tasks by any database application, including Oracle Internet Directory. If memory resources are insufficient for any of these tasks, then the CPUs work less efficiently and system performance drops. Furthermore, memory usage increases in proportion to the number of concurrent connections to the database and the number of concurrent users of the directory. For the purposes of capacity planning, an active connection begins when a client seeks to bind to the directory and ends when that bind is terminated.

The memory available to processes comes from the virtual memory on the system, which is somewhat more than available physical memory. If the sum of all active memory usage exceeds the available physical memory on the system, the operating system may need to store some of the memory pages on disk. This is called paging. Paging can degrade performance if memory is too oversubscribed. Generally, you should not exceed 20 percent over-subscription of physical memory. If paging occurs, you need either to scale back memory usage by processes or to add more physical memory. Keep in mind the trade-offs: There are physical limits to the amount of memory you can add, but scaling back on memory usage for each process can significantly degrade performance.

The main consumers of memory are the database buffer cache within the system global area (SGA) and the OID Server Entry Cache (if enabled). Getting a good hit ratio for the buffer cache and the entry cache requires allocating enough memory in each area. The following formula gives a rough estimate for the amount of RAM required to cache 'N' entries in the entry cache:

N * [ 150 + { (attrs_per_entry + 6) * (avg_attrname_len + avg_attr_size + 40) } ] * 1.3
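As an illustration, the entry cache formula can be wrapped in a small function. The sketch below reads the formula as N * [150 + (attrs_per_entry + 6) * (avg_attrname_len + avg_attr_size + 40)] * 1.3 and uses the Acme Corporation values from Table 20-10. The function name is illustrative, and caching every entry is only a hypothetical upper bound, not a recommendation.

```python
def entry_cache_bytes(n_entries, attrs_per_entry, avg_attrname_len, avg_attr_size):
    """Rough RAM estimate, in bytes, for caching n_entries in the entry cache."""
    per_entry = 150 + (attrs_per_entry + 6) * (avg_attrname_len + avg_attr_size + 40)
    return n_entries * per_entry * 1.3

# Hypothetical case: cache all one million Acme entries.
estimate = entry_cache_bytes(
    n_entries=1_000_000,
    attrs_per_entry=20,
    avg_attrname_len=6,
    avg_attr_size=32,
)
print(f"{estimate:,.0f} bytes, about {estimate / 1024**3:.2f} GiB")
```

With these inputs, each cached entry costs 150 + 26 x 78 = 2,178 bytes before the 1.3 multiplier, so a full cache would need roughly 2.8 billion bytes.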


See Also:

Chapter 21, "Tuning Considerations for the Directory" for further information on SGA tuning

Table 20-12 gives minimum memory requirements for different directory configurations.

Table 20-12 Minimum Memory Requirements for Different Directory Configurations

Directory Type  Entry Count             Minimum Memory

Small           Less than 600,000       512 MB
Medium          600,000 to 2,000,000    1 GB
Large           Greater than 2,000,000  2 GB


Going back to our example of Acme Corporation, the number of entries in the directory is close to 1,000,000 (1 million), which places it in the medium category with a minimum of 1 GB. Oracle Corporation recommends choosing the 2 GB option in order to maximize performance.
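The thresholds in Table 20-12 can be expressed as a simple lookup. This is only a sketch with an illustrative function name. It returns the table's minimum, which is a floor rather than the recommended amount.

```python
def minimum_memory_mb(entry_count):
    """Minimum memory in MB for a directory of the given size (Table 20-12)."""
    if entry_count < 600_000:
        return 512       # small directory
    if entry_count <= 2_000_000:
        return 1024      # medium directory
    return 2048          # large directory

acme_minimum = minimum_memory_mb(1_000_000)
print(f"Acme minimum: {acme_minimum} MB")
```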

20.5 Network Requirements

The network is rarely a bottleneck in most installations. However, serious consideration must be given to it during the capacity planning stage. If the clients do not get adequate network bandwidth to send and receive messages from Oracle Internet Directory, the overall throughput will seem to be very low. For example, suppose that we have configured Oracle Internet Directory to service 800 search operations every second, but the computer running the Oracle directory server is accessible only through a 10 Mbps network (10Base-T switched Ethernet) with only 60 percent of the bandwidth available. In that case, the clients will see a throughput of only 600 search operations a second (assuming each search operation causes 1024 bytes to be transferred on the network). Table 20-13 shows the maximum possible throughput (in operations every second) for two types of operations (one requiring a transfer of 1024 bytes, the other requiring a transfer of 2048 bytes) for two types of networks, 10 Mbps and 100 Mbps, at different rates of bandwidth availability.

Table 20-13 Maximum Possible Throughput for Two Types of Operations

Percent Available  Operations/sec (1024 bytes)  Operations/sec (2048 bytes)
Bandwidth          10 Mbps      100 Mbps        10 Mbps      100 Mbps

30                 300          3000            150          1500
40                 400          4000            200          2000
50                 500          5000            250          2500
60                 600          6000            300          3000
70                 700          7000            350          3500
80                 800          8000            400          4000
90                 900          9000            450          4500


In some cases, it may also be important to consider the network latency of sending a message from a client to the Oracle directory server. In some WAN implementations, the network latencies may become as high as 500 milliseconds, which may cause the clients to time out for certain operations. In summary, given a range of networking options, the preferred choice should always be the highest-bandwidth, lowest-latency network.

Going back to the example of Acme Corporation, their peak usage rate is 936,000 lookups every hour, which is about 260 directory operations every second. Assuming that each operation results in a transfer of 2 KB of data on the network, Table 20-13 implies that we need either a 100 Mbps network or at least 60 percent of the bandwidth available on a 10 Mbps network. Since the 100 Mbps network will typically have a lower latency, we will choose that over the 10 Mbps network.
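The values in Table 20-13 are consistent with a simple rule of thumb: each Mbps of available bandwidth carries about 100 one-kilobyte operations every second. That rule is inferred from the table and is not stated by the chapter. A raw line-rate calculation would give somewhat higher numbers. The sketch below encodes the rule and applies it to the Acme Corporation peak load; all names are illustrative.

```python
import math

def max_ops_per_sec(link_mbps, available_percent, payload_kb):
    """Throughput ceiling that reproduces the values in Table 20-13.

    Assumed rule (inferred from the table): 1 Mbps of available bandwidth
    carries about 100 operations/sec of 1 KB each.
    """
    return link_mbps * available_percent / payload_kb

def min_available_percent(required_ops, link_mbps, payload_kb):
    """Smallest share of the link, in percent, that sustains required_ops."""
    return required_ops * payload_kb / link_mbps

# Acme Corporation: 260 operations/sec at 2 KB each.
needed_on_10 = min_available_percent(260, link_mbps=10, payload_kb=2)
needed_on_100 = min_available_percent(260, link_mbps=100, payload_kb=2)

# Table 20-13 is tabulated in steps of 10 percent, so round up to the next row.
table_row_on_10 = math.ceil(needed_on_10 / 10) * 10

print(f"10 Mbps:  {needed_on_10:.0f}% needed (table row: {table_row_on_10}%)")
print(f"100 Mbps: {needed_on_100:.1f}% needed")
```

On a 10 Mbps link the requirement works out to 52 percent, which rounds up to the 60 percent row of the table. On a 100 Mbps link it is about 5 percent.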

20.6 CPU Requirements


20.6.1 CPU Configuration

The CPU sizing for Oracle Internet Directory is directly a function of the user workload. The following factors will determine CPU configuration:

  • The number of concurrent operations you want to support. This will be directly dependent on the number of users performing operations simultaneously.

  • The acceptable latency of each operation. For example, in an e-mail application, a latency for each operation of 100 milliseconds might be desirable, but in most cases a latency of 500 milliseconds might still be acceptable.

CPU resources can be added to a system as the workload increases, but these additions seldom bring linear scalability to all operations because many operations are not purely CPU bound. We classify the processing power of a computer by a performance characteristic that is commonly available from all vendors, namely, the SPECint_rate95 baseline. This number is derived from a set of integer tests and is available from all system vendors as well as the SPEC Web site (http://www.spec.org).


Note:

SPECint_rate95 should not be confused with the regular SPECint95 performance number. The SPECint95 performance number gives an idea of the integer processing power of a particular CPU (for systems with multiple CPUs, this number is typically normalized). The SPECint_rate95 gives the integer processing power of an entire system without any normalization.

Because Oracle Internet Directory makes efficient use of multiple CPUs on an SMP computer, we chose to categorize computers based on their SPECint_rate95 numbers. Even within SPECint_rate95 we chose the baseline number as opposed to the commonly advertised result. This is because the commonly advertised result is actually the peak performance of a computer, whereas the baseline number represents the performance in normal circumstances.

20.6.2 Rough Estimates of CPU Requirements

Since Oracle Internet Directory typically co-resides with the Oracle Database, we recommend at least a two-CPU system. We give the rough estimates in Table 20-14 based on the level of usage of Oracle Internet Directory.

Table 20-14 Rough Estimates of CPU Requirements

Usage              Num CPUs  SPECint_rate95 baseline  System

Departmental       2         60 to 200                Compaq AlphaServer 8400 5/300 (300 MHz x 2)
Organization wide  4         200 to 350               IBM RS/6000 J50 (200 MHz x 4)
Enterprise wide    4+        350+                     Sun Ultra 450 (296 MHz x 4)


20.6.3 Detailed Calculations of CPU Requirements

It is difficult to determine the CPU requirements for all operations at a given deployment site since the amount of CPU consumed depends upon several factors, such as:

  • The type of operation: base search, subtree search, modify, add, and so on

  • Whether SSL mode is enabled, since SSL consumes an additional 15 to 20 percent of CPU resources

  • Whether the Oracle Internet Directory server entry cache is enabled, since the hit ratio affects CPU usage

  • The number of entries returned for a search

  • The number of access control policies that need to be checked as part of a search

In most of the cases, except SSL, we can expect that there is a large latency between the Oracle Internet Directory server process and the database. When a thread in the Oracle Internet Directory server process is waiting for the database to respond, other threads within the Oracle Internet Directory server process can be put to work by other client requests needing LDAP server specific processing. As a result, for any mix of operations, one can always come up with a combination of concurrent clients and Oracle Internet Directory server processes that will result in 100 percent CPU utilization. In this case, the CPU becomes the bottleneck.

Given this fact, we have taken a 'messaging' type of subtree search operation and tried to estimate the CPU resources needed to support a given number of concurrent operations without degrading the throughput of operations. The 'messaging' search operation involves subtree scope, a simple exact match filter, and a result set of one entry. For Oracle Internet Directory 10g Release 2 (10.1.2):

SPECint_rate95 baseline = 0.5 * (max # of concurrent operations at peak throughput)

This means that, if we need to support 600 concurrent operations without degrading the throughput of operations, then we need a computer that has at least a SPECint_rate95 baseline rating of (0.5 * 600) = 300.

In terms of throughput of operations, for Oracle Internet Directory 10g Release 2 (10.1.2):

SPECint_rate95 baseline = 0.4 * (throughput of operations at max supported concurrency)

What this means is that if we need a throughput of 750 operations every second for the given maximum number of supported concurrent operations, then we need a computer that has at least a SPECint_rate95 baseline rating of (0.4 * 750) = 300.
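The two sizing rules above can be captured in a short sketch. The function names are illustrative. Taking the larger of the two results when both a concurrency target and a throughput target apply is an assumption of this sketch, not something the chapter states.

```python
# Coefficients given in this section for Oracle Internet Directory
# 10g Release 2 (10.1.2) and the 'messaging' subtree search.
CONCURRENCY_FACTOR = 0.5
THROUGHPUT_FACTOR = 0.4

def baseline_for_concurrency(concurrent_ops):
    """SPECint_rate95 baseline needed for a number of concurrent operations."""
    return CONCURRENCY_FACTOR * concurrent_ops

def baseline_for_throughput(ops_per_sec):
    """SPECint_rate95 baseline needed for a throughput in operations/sec."""
    return THROUGHPUT_FACTOR * ops_per_sec

def required_baseline(concurrent_ops, ops_per_sec):
    """Assumption: size for whichever requirement is more demanding."""
    return max(baseline_for_concurrency(concurrent_ops),
               baseline_for_throughput(ops_per_sec))

print(baseline_for_concurrency(600))   # the 600-operation example above
print(baseline_for_throughput(750))    # the 750 operations/sec example above
print(required_baseline(600, 750))
```

Both worked examples in the text give a rating of about 300, so a computer sized for one of them also satisfies the other.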

It has been proven that Oracle Internet Directory scales very well with additional CPU resources. What this means is:

  • For a given concurrency of operations, we can achieve higher throughput of operations (and hence, a lower latency) by adding additional CPU resources.

  • For a given throughput of operations (and latency), we can support higher concurrency of operations by adding additional CPU resources.

Going back to our example of Acme Corporation, let us assume that we want adequate CPU resources to support 500 concurrent 'messaging' type of subtree search operations with each client seeing subsecond latency. Applying the concurrency formula gives a SPECint_rate95 baseline of 0.5 * 500 = 250. Taking a factor of safety of 20 percent, our preliminary estimate of CPU requirement would be a computer with a SPECint_rate95 baseline of at least 300.

20.7 Summary of Capacity Plan for Acme Corporation

In the preceding sections, we have described various components involved in capacity planning and have also shown how each of them would apply to an Oracle Internet Directory deployment at a hypothetical company named Acme Corporation. In this section we give a quick summary of all of the recommendations made. Following were the initial assumptions:

Based on these requirements and further assumptions, we developed the following recommendations:

Several simplifying assumptions were made so that the sizing calculations could be more intuitive.