Chapter 3 Planning Your Sizing Strategy

When you design your deployment, you must decide how to configure your Instant Messaging server to provide optimum performance, scalability, and reliability.

Sizing is an important part of this effort. The sizing process allows you to identify what hardware and software resources are needed so that you can deliver your desired level of service or response time according to the estimated workload that your Instant Messaging server users generate. It is an iterative effort.

This chapter introduces the basics of sizing your Instant Messaging deployment to enable you to obtain the right sizing data by which you can make deployment decisions. It also provides the context and rationale for the Instant Messaging sizing process.

Because each deployment is unique, this chapter does not provide detailed sizing information for your specific site, or for servers with which Instant Messaging interoperates, such as LDAP and SMTP servers. Instead, it explains what to consider when you develop your sizing plan and provides general guidelines for the Instant Messaging components, which you can adapt to suit your site’s needs. Work with your Sun ONE technical representative to determine your deployment hardware and software needs.

Collecting Sizing Data

Using a Load Simulator

System Performance Guidelines

Developing Architectural Strategies

Example Resource Requirements

Collecting Sizing Data

Use this section to identify the data you need in order to size your Instant Messaging deployment. The following topics are covered in this section:

Determine Peak Volume of Unique Logins

Create Your Usage Profile

Define Your User Base or Site Profile

Determine Peak Volume of Unique Logins

Your peak volume is the largest concentrated number of unique logins to your Instant Messaging system within a given period in a day. The volume can vary from site to site as well as across different classes of users. For example, peak volume among groups may occur during corporate core hours, which differ between time zones.

Determine when and for how long the peaks occur.

Size your deployment against peak volume load assumptions.

Once patterns are analyzed, choices can be made to help the system handle the load and provide the services that users demand.

When you determine the peak volume on your system, be sure that your Instant Messaging deployment can support it.

Create Your Usage Profile

Measuring your load is important for accurate sizing. Your usage profile describes the load that programs and processes place on your Instant Messaging servers and multiplexors.

This section helps you create your usage profile to measure the amount of load that is placed on your deployment.

What is the total number of users on your system?

When counting the number of users on your system, account for not only the users who have accounts and can log into the system, but also the users with accounts who are currently not logged into the system. Table 3-1 describes the types of users that make up the total.

Characterize your configured users using the three general profiles in Table 3-1. The connected users (inactive and active) give you an idea of the total number of concurrent connections you need to support.

Table 3-1 Active Versus Inactive User
Connection	Description
Inactive User	A user with an Instant Messaging account who currently is not logged into the system. Non-connected users consume disk space but no CPU or memory.
Connected/Inactive	These users are logged in, but are not currently sending or receiving instant messages.
Connected/Active	Logged into the system and actively sending messages, updating user information such as contact lists, and attending conferences throughout the day.

How many connections are on your system during your peak volume?

Correctly estimating the maximum number of concurrent users that the system must sustain is key to planning your resource requirements. Although a deployment usually has a maximum number of configured users, it is important to plan for the maximum number of concurrent users (connected and more or less active). A conservative estimate for the number of concurrent users can be based on a 1:10 ratio of concurrent to configured users. Thus, for a deployment of 50,000 configured users, the number of concurrent users would be 5,000.
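The 1:10 rule of thumb is simple arithmetic. The following is a minimal Python sketch of it; the function name and the adjustable ratio parameter are illustrative assumptions and are not part of the product.

```python
def estimate_concurrent_users(configured_users: int, ratio: int = 10) -> int:
    """Estimate concurrent users from configured users using a 1:ratio rule."""
    if configured_users < 0 or ratio <= 0:
        raise ValueError("configured_users must be >= 0 and ratio must be > 0")
    # Ceiling division, so a partial user is rounded up rather than dropped.
    return -(-configured_users // ratio)


# The example from the text: 50,000 configured users at a 1:10 ratio.
print(estimate_concurrent_users(50_000))  # 5000
```

If measurements at your site show a different ratio, pass it explicitly rather than relying on the default.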

Specifically, note the number of concurrent, idle, and busy connections.

Table 3-2 Client Connections
Connection	Description
Concurrent Connection	Number of unique TCP connections or sessions that are established on your system at any given time.
Idle Connection	Connection where no information is being sent between the client and multiplexor or server and multiplexor.
Busy Connection	A connection that is in progress. An established connection where information is being sent between the client and multiplexor or multiplexor and server.

To determine the number of concurrent connections in your deployment, you can either:

Count the number of established TCP connections by using the netstat command on Unix platforms.

Obtain the last login and logout times for Instant Messenger users.
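As a rough illustration of the netstat approach, the following Python sketch counts established TCP connections whose local port matches the port your multiplexor listens on. The function names are hypothetical, the port must be supplied by you, and the parsing assumes the usual netstat -an layouts in which the local address is the first address on each line (host:port on Linux, host.port on Solaris). Verify the output format on your platform before relying on the count.

```python
import re
import subprocess

# Matches the trailing port of an address such as 10.0.0.5:49909 (Linux)
# or 10.0.0.5.49909 (Solaris).
_PORT_SUFFIX = re.compile(r"[.:](\d+)$")


def count_established(netstat_output: str, local_port: int) -> int:
    """Count ESTABLISHED lines whose local address ends in local_port."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        if "ESTABLISHED" not in fields:
            continue
        # The first address-like field on the line is the local address.
        for field in fields:
            match = _PORT_SUFFIX.search(field)
            if match:
                if int(match.group(1)) == local_port:
                    count += 1
                break
    return count


def current_connections(local_port: int) -> int:
    """Run netstat -an and count established connections to local_port."""
    output = subprocess.run(
        ["netstat", "-an"], capture_output=True, text=True, check=True
    ).stdout
    return count_established(output, local_port)
```

Sampling current_connections at regular intervals during your peak period gives a picture of how concurrent connections rise and fall over the day.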

To determine the number of concurrent connections you can support, you need to obtain two values from parameters within the iim.conf file that are used for tuning multiplexor performance:

iim_mux.numinstances - Specifies the number of multiplexor instances.

iim_mux.maxsessions - Specifies the maximum number of clients that one multiplexor process can handle. The default is 1000.

Once you have obtained these values, multiply the numinstances number by the maxsessions number. This gives the total number of concurrent connections supported by your deployment. For information on locating the iim.conf file, see the Sun ONE Instant Messaging Administrator’s Guide.
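The multiplication described above can be sketched in Python as follows. The parser assumes that iim.conf holds one "name = value" setting per line and that lines starting with # or ! are comments; that layout is an assumption, so check it against your own file. The function names are hypothetical.

```python
def parse_conf(text: str) -> dict:
    """Parse 'name = value' lines into a dict (assumed iim.conf layout)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def max_concurrent_connections(settings: dict) -> int:
    """Return numinstances multiplied by maxsessions."""
    # numinstances must be present; this chapter does not state a default.
    instances = int(settings["iim_mux.numinstances"])
    # maxsessions falls back to the documented default of 1000.
    sessions = int(settings.get("iim_mux.maxsessions", 1000))
    return instances * sessions


example = parse_conf("iim_mux.numinstances = 4\niim_mux.maxsessions = 2000\n")
print(max_concurrent_connections(example))  # 8000
```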

If you have a large deployment, how will you organize your users?

For example, consider placing active users and inactive users on separate machines.

If an inactive user becomes an active user, that user can be moved to the active user machines. This approach could require less hardware than placing inactive and active users together on the same machines.

What is the amount of storage used for each user?

If you are not storing your end user data, such as contact lists, in LDAP, you need to plan for the space required to store this data. If you configure the server to store this data outside LDAP, the server stores it in a flat file. See the Sun ONE Instant Messaging Administrator’s Guide for more information.

How many messages enter your Instant Messaging system from the Internet?

The number of messages should be measured in messages per second during your peak volume.

How many messages are sent by your users to:

end users on your system?

the Internet?

This number of messages is also measured in messages per second during the peak volume.

Will you be using Secure Sockets Layer (SSL)? If yes, what percentage of users and what type of users?

For example, in a particular organization, 20% of connections during peak hours will enable SSL.

Answering these questions provides a preliminary usage profile for your deployment. You can refine your usage profile as your Instant Messaging needs change.

Additional Questions

While the following questions are not applicable to creating your usage profile, they are important to developing your sizing strategy. How you answer these questions may require you to consider additional hardware.

How much redundancy do you want in your deployment?

For example, do you need to consider high availability?

What backup and restore strategy do you have in place (such as disaster recovery and site failover)? What are the expected times to accomplish recovery tasks?

Typically you need to back up the server configuration files, database, and any resource files you have customized.

Define Your User Base or Site Profile

Once you establish a usage profile, compare it to sample pre-defined user bases that are described in this section. A user base is made up of the types of Instant Messaging operations that your users will perform. Instant Messaging users fall into one of these user bases:

Casual Users

Heavy Users

The sample user bases described in this section broadly generalize user behavior. Your particular usage profile may not exactly match the user bases; you will be able to adjust these differences when you run your load simulator (as described in Using a Load Simulator).

Casual Users

A casual user base typically consists of users with simple Instant Messaging requirements. These users rarely initiate chat sessions and rarely receive invitations. They may only use Instant Messaging as a presence tool.

Heavy Users

A heavy user uses significantly more system resources than a casual user. Typical usage for this type of user may be something like the following:

Presence updates equal to or greater than 20 times a day.

Contact list contains about 30 contacts.

Subscribes to the presence updates of all the contacts in the contact list.

Sets up around 4 conferences or chats per day, where each conference has 3 people in the conference room, lasts an average of 10 minutes, and has a message added every 1 to 15 seconds.
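To get a feel for what the conference figures in this profile imply, the following Python sketch computes the range of messages added per day to the conferences that one heavy user sets up. The function name is hypothetical, and the sketch counts only messages added to the conference rooms; delivery of each message to the other participants is not counted.

```python
def conference_messages_per_day(conferences=4, minutes_each=10,
                                min_interval_s=1, max_interval_s=15):
    """Return (fewest, most) messages added per day to one user's conferences."""
    total_seconds = conferences * minutes_each * 60
    fewest = total_seconds // max_interval_s  # one message every 15 seconds
    most = total_seconds // min_interval_s    # one message every second
    return fewest, most


print(conference_messages_per_day())  # (160, 2400)
```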

Using a Load Simulator

A load simulator creates a peak volume environment and calibrates the amount of load placed on your servers. You can determine if you need to alter your hardware, throughput, or deployment architecture to meet your expected response time, without overloading your system. To use a load simulator:

Define the user base that you want to test (for example, casual users).

If necessary, adjust individual parameters to best match your usage profile.

Define the hardware that will be tested.

Run the load simulator and measure the maximum number of concurrent connections on the tested hardware with the user base.

Publish your results and compare those results with production deployments.

Repeat this process using different user bases and hardware until you get the response time that is within an acceptable range for your organization under peak load conditions.



Note	Contact Sun Professional Services for recommended load simulators and support.

System Performance Guidelines

Once you evaluate your hardware and user base with a load simulator, you need to assess your system performance. The following topics address methods by which you can improve your overall system performance:

Multiplexor Configuration Best Practices

Memory Utilization

Make sure you have an adequate amount of physical memory on each machine in your deployment. Additional physical memory improves performance and enables the server to operate at peak volume. With sufficient memory, Instant Messaging can operate efficiently without excessive swapping.

For most deployments, you need at least 256 MB of RAM. The amount of RAM needed depends on the number of concurrent client connections, and whether the server and multiplexor are deployed on the same host. For information about concurrent connections, see "Create Your Usage Profile". For information on hosting the server and multiplexor on the same host, see "Developing Architectural Strategies".

On Unix, you can set the amount of memory allocated to the server by modifying the iim.jvm.maxmemorysize parameter in the iim.conf file. This parameter specifies the maximum number of megabytes of memory that the JVM running the server is allowed to use. The default setting is 256 MB, and the maximum setting is 500 MB. For instructions on modifying this parameter, see the Sun ONE Instant Messaging Administrator’s Guide. You cannot currently change this value on Windows NT.

Disk Throughput

Disk throughput is the amount of data that your system can transfer from memory to disk and from disk to memory. The rate at which this data can be transferred is critical to the performance of Instant Messaging. To improve efficiency in your system’s disk throughput:

Consider your maintenance operations, and ensure you have enough bandwidth for backup. Backups can also affect network bandwidth, particularly remote backups. Private backup networks can be a more efficient alternative.

Carefully partition the data stores to improve throughput efficiency.

Stripe data across multiple disk spindles in order to speed up operations that retrieve data from disk.

Disk Capacity

When planning server disk space, include space for the operating environment software, the Instant Messaging software, and any servers not currently in your network that must be installed to support Instant Messaging (such as LDAP). Be sure to use an external disk array. You also need to allocate disk space for users; typically, this space is determined by your site’s policy. Typical installations require:

Approximately 300 MB of free disk space for each server or multiplexor.

Approximately 5K of disk space for each user.

Additional space for the Instant Messaging archive.
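The figures in this list can be combined into a rough estimate, as in the following Python sketch. The function name is hypothetical, "5K" is read as 5 KB, 1 MB is taken as 1024 KB, and the archive size is left for you to supply because this chapter gives no figure for it.

```python
PER_COMPONENT_MB = 300  # free disk space for each server or multiplexor
PER_USER_KB = 5         # disk space for each user


def estimate_disk_mb(components: int, users: int, archive_mb: float = 0.0) -> float:
    """Estimate disk space in MB for Instant Messaging components and users."""
    return components * PER_COMPONENT_MB + users * PER_USER_KB / 1024 + archive_mb


# One server and one multiplexor with 10,000 users and no archive.
print(round(estimate_disk_mb(components=2, users=10_000), 1))  # 648.8
```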

Use Table 3 to determine server and multiplexor memory sizing numbers, with archiving enabled or disabled. The figures listed in the table were generated using a 400 MHz UltraSPARC II processor.

Table 3 Server and Multiplexor Memory Disk Space Sizing for Concurrent Users
	Server Memory Consumption for Connected/Inactive Users	Server Memory Consumption for Connected/Active Users	Multiplexor Memory Consumption for Connected/Inactive Users	Multiplexor Memory Consumption for Connected/Active Users
Archive Disabled	8 MB + 20 KB per User	120 MB + 20 KB per User	8 MB + 20 KB per User	8 MB + 28 KB per User
SSO/Portal/Archive Enabled	100 MB + 25 KB per User	120 MB + 30 KB per User	8 MB + 35 KB per User	8 MB + 40 KB per User

Please refer to the product documentation for other servers interoperating with Instant Messaging for their specific requirements.
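Each cell in Table 3 is a fixed base plus a per-user amount, so it can be evaluated for a given number of users. The following Python sketch copies the table values into a lookup; the function name is hypothetical, the per-user figures are read as KB, and 1 MB is taken as 1024 KB.

```python
# (archive_enabled, component, user_state): (base_mb, per_user_kb), from Table 3.
TABLE_3 = {
    (False, "server", "inactive"): (8, 20),
    (False, "server", "active"): (120, 20),
    (False, "multiplexor", "inactive"): (8, 20),
    (False, "multiplexor", "active"): (8, 28),
    (True, "server", "inactive"): (100, 25),
    (True, "server", "active"): (120, 30),
    (True, "multiplexor", "inactive"): (8, 35),
    (True, "multiplexor", "active"): (8, 40),
}


def estimate_memory_mb(component, user_state, users, archive_enabled=False):
    """Evaluate one Table 3 cell for the given number of connected users."""
    base_mb, per_user_kb = TABLE_3[(archive_enabled, component, user_state)]
    return base_mb + users * per_user_kb / 1024


# Server memory for 1,024 connected/active users with archiving disabled.
print(estimate_memory_mb("server", "active", 1024))  # 140.0
```

The sketch evaluates a single cell at a time. How to combine the active and inactive figures for a host that serves both kinds of users is not specified in the table, so treat a combined figure as an approximation.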

Network Throughput

Network throughput is the amount of data that can travel through your network between your client application and server at a given time.

To avoid bottlenecks, ensure that the network infrastructure can handle the load.

Partition your network.

To ensure that sufficient capacity exists for future expansion, don’t use theoretical maximum values when configuring your network.

Separate traffic flows on different network partitions to reduce collisions and to optimize bandwidth use.

CPU Resources

Provide enough CPU capacity for your servers and multiplexing services. In addition, provide enough CPU capacity for any RAID systems that you plan to use. If you intend to use archiving in your deployment, you need to take its resource requirements into consideration as well.

Use Table 4 to help determine the number of CPUs your installation requires for optimum performance. The figures listed in the table were generated using a 400 MHz UltraSPARC II processor.

Table 4 CPU Utilization Numbers
	Server CPU Utilization for Connected/Inactive Users	Server CPU Utilization for Connected/Active Users	Multiplexor CPU Utilization for Connected/Inactive Users	Multiplexor CPU Utilization for Connected/Active Users
Archive Disabled	Several hundred thousand users per CPU	30,000 users per CPU	50,000 users per CPU	5,000 users per CPU
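The users-per-CPU figures in Table 4 translate into a CPU count by dividing and rounding up, as in the following Python sketch. The function name is hypothetical. Because the table gives no exact figure for connected/inactive users on the server, the sketch requires you to supply one for that case.

```python
import math

# Users per CPU with archiving disabled, from Table 4.
USERS_PER_CPU = {
    ("server", "active"): 30_000,
    ("multiplexor", "inactive"): 50_000,
    ("multiplexor", "active"): 5_000,
}


def cpus_needed(component, user_state, users, users_per_cpu=None):
    """Return the number of CPUs for the given users, with a minimum of one."""
    if users_per_cpu is None:
        # Raises KeyError for ("server", "inactive"): pass your own figure.
        users_per_cpu = USERS_PER_CPU[(component, user_state)]
    return max(1, math.ceil(users / users_per_cpu))


# 12,000 connected/active users on the multiplexor.
print(cpus_needed("multiplexor", "active", 12_000))  # 3
```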

Multiplexor Configuration Best Practices

Consider the following suggestions and generalizations when planning your multiplexor deployment. The parameters discussed in this section are located in the iim.conf file. See the Sun ONE Instant Messaging Administrator’s Guide for more detailed information about these parameters.

The value of iim_mux.maxthreads should not exceed the number of CPUs on your server.

The iim_mux.maxsessions value should be high enough to avoid rejecting connections, but low enough that the multiplexor processes do not get overloaded.

Be sure that your expected number of concurrent client connections is less than the maximum possible by a safe margin.

Do not configure threads or number of concurrent sessions to more than you require. Otherwise, you will unnecessarily consume system resources.

A good starting point is to configure iim_mux.numinstances to the number of CPUs on the system.
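These suggestions can be checked mechanically against a planned configuration. The following Python sketch is one way to do so; the function name is hypothetical, and the 20 percent default margin is an assumption, because this chapter asks only for "a safe margin" without giving a figure.

```python
def check_mux_settings(cpus, numinstances, maxthreads, maxsessions,
                       expected_peak, margin_percent=20):
    """Return a list of warnings for a planned multiplexor configuration."""
    warnings = []
    if maxthreads > cpus:
        warnings.append("iim_mux.maxthreads exceeds the number of CPUs")
    capacity = numinstances * maxsessions
    # Integer arithmetic: peak plus margin must fit within total capacity.
    if expected_peak * (100 + margin_percent) > capacity * 100:
        warnings.append("expected peak connections leave too little headroom")
    return warnings


# Four CPUs, four instances of 1000 sessions each, 3000 expected connections.
print(check_mux_settings(cpus=4, numinstances=4, maxthreads=4,
                         maxsessions=1000, expected_peak=3000))  # []
```

An empty list means the plan is consistent with the suggestions above; it does not replace testing with a load simulator.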

Developing Architectural Strategies

Once you have identified your system performance needs, the next step in sizing your Instant Messaging deployment is to size specific components based on your architectural decisions.

The following sections point out sizing considerations when you deploy two-tiered and one-tiered architectures:

Two-tiered Architecture

One-tiered Architecture

Two-tiered Architecture

A two-tiered architecture splits the Instant Messaging server deployment into two layers: an access layer and a data layer. In a simplified two-tiered deployment, you might add one or more multiplexors and servers to the access layer. The multiplexor acts as a proxy for users and relays messages to the Instant Messaging server. The data layer holds the Instant Messaging server database and Directory servers. Figure 3-1 shows a simplified two-tiered architecture.

Two-tiered architectures have advantages over one-tiered architectures that can impact your sizing decisions. Two-tiered architectures:

Are easier to maintain than one-tiered architectures.

Allow the offloading of load-intensive processes such as SSL and message reprocessing.

Make growth easier to manage and let you upgrade your system with limited overall downtime.

To Size Your Multiplexing Services

When you size your multiplexor, the calculation is based on your system load, particularly the number of concurrent connections the multiplexor needs to handle.

Add CPU or a hardware accelerator for SSL if appropriate.

Add memory to the machine if the multiplexor is being configured on it.

Account for Denial of Service attacks.

Add capacity for load balancing and redundancy, if appropriate.

When you plan for redundancy in your deployment, one or more of each type of machine should still be able to handle peak load without a substantial impact on throughput or response time.

One-tiered Architecture

In a one-tiered architecture, there is no separation between access and data layers. The Instant Messaging server, multiplexor, and sometimes the Directory server are installed in one layer. Figure 3-2 illustrates the idea.

One-tiered architectures have lower up-front hardware costs than two-tiered architectures. However, if you choose a one-tiered architecture, you need to allow for significant maintenance windows.

Add CPU for SSL, if necessary.

Account for Denial of Service attacks.

Add more disks for the increased number of client connections.

Add more disks for each multiplexor.

For specific instructions on sizing Instant Messaging components in one-tiered or two-tiered architectures, contact your Professional Services representative.

Example Resource Requirements

This section provides example resource distributions and recommended sizing information for the following two deployment types:

Small Deployment Sample Resource Requirements Numbers

Large Deployment Sample Resource Requirements Numbers

Small Deployment Sample Resource Requirements Numbers

For a small deployment with the server and multiplexor on a single server having 10,000 users with the following profile:

30% connected/active

20% connected/inactive

50% not connected
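As a starting point for applying the sizing guidelines in this chapter, the percentages in this profile can be converted into user counts. The following Python sketch does only that conversion; the function name is hypothetical, and the sketch does not reproduce any resource requirement figures.

```python
def split_by_profile(total_users: int, percentages: dict) -> dict:
    """Convert a profile of percentages into user counts."""
    if sum(percentages.values()) != 100:
        raise ValueError("profile percentages must add up to 100")
    return {name: total_users * pct // 100 for name, pct in percentages.items()}


small_deployment = split_by_profile(
    10_000,
    {"connected/active": 30, "connected/inactive": 20, "not connected": 50},
)
print(small_deployment)
# {'connected/active': 3000, 'connected/inactive': 2000, 'not connected': 5000}
```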

Large Deployment Sample Resource Requirements Numbers

5% connected/active

20% connected/inactive

75% not connected

The server memory requirements are 4 GB RAM on 2 CPUs. The multiplexor requirement is 4 GB RAM on 16 CPUs.