Understanding PeopleSoft MCF Server and Cluster Architecture

This section discusses PeopleSoft MCF server and cluster architecture.

PeopleSoft MCF depends on processes that are configured and booted as part of an application server domain. Configure MCF processes (servers) through PSADMIN, along with other processes in each application server domain.

Note: The Real-time Event Notification (REN) server process can be used by applications that are separate from the queue server and MCF log processes. In this case, you can configure the application server domain for event notification without creating the MCF servers.

After considering performance and failover issues, the MCF system administrator provides configuration information that describes the arrangement of queues, domains, queue server processes, REN server processes, MCF processes, and URL addresses.

REN server processes are configured on PeopleSoft Pure Internet Architecture pages before or after being initiated in an application server domain. Each queue server process is uniquely identified in the system by the combination of the machine name, domain subdirectory name, and process identifier. MCF log processes use the same queue-server identification scheme.
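The three-part identification scheme can be pictured as a composite key. The following is a minimal sketch only: the class name, field names, and sample values are hypothetical and are not part of any PeopleSoft API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerProcessId:
    """Hypothetical composite key: machine, domain subdirectory, process identifier."""
    machine_name: str
    domain_dir: str   # domain subdirectory name
    process_id: int   # process identifier


def find_duplicates(ids):
    """Return the identifiers that appear more than once in `ids`."""
    seen, dupes = set(), set()
    for server_id in ids:
        if server_id in seen:
            dupes.add(server_id)
        seen.add(server_id)
    return dupes


if __name__ == "__main__":
    ids = [
        ServerProcessId("hostA", "MCFDOM1", 1),
        ServerProcessId("hostB", "MCFDOM1", 1),  # same domain name, different machine
        ServerProcessId("hostA", "MCFDOM1", 1),  # exact repeat of the first entry
    ]
    print(find_duplicates(ids))
```

Two processes that share a domain subdirectory name remain distinct as long as the machine name or process identifier differs; only an exact match on all three parts is a collision.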

Both the queue server and the MCF log server are REN Java clients. If the REN server is configured to accept only SSL connections, then you must configure SSL certificates for the queue server and MCF log server.

See Installing Digital Certificates.

In a PeopleSoft system, all application server processes, including MCF servers and REN servers, belong to an application server domain. Each domain can have only one REN server process (PSRENSRV), one queue server process (PSUQSRV), and one MCF log server process (PSMCFLOG). Domains can be redundantly clustered to provide failover. Logical queues can be serviced by multiple clusters for scalability. Support for scalability and failover is integrated into the configuration process.
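The one-per-domain limit can be checked against a planned topology before any domain is configured. The following sketch assumes a hypothetical planning structure (a dictionary of domain names to process lists); it does not read PSADMIN configuration files, and the domain names are illustrative.

```python
from collections import Counter

# Processes that the text above limits to one per domain.
SINGLETON_PROCESSES = ("PSRENSRV", "PSUQSRV", "PSMCFLOG")


def over_limit(domain_processes):
    """Return {domain: {process: count}} for singleton processes planned more than once.

    `domain_processes` maps a domain name to the list of process names
    planned for that domain (a hypothetical planning format).
    """
    problems = {}
    for domain, procs in domain_processes.items():
        counts = Counter(procs)
        extra = {p: counts[p] for p in SINGLETON_PROCESSES if counts[p] > 1}
        if extra:
            problems[domain] = extra
    return problems


if __name__ == "__main__":
    plan = {
        "DOM1": ["PSRENSRV", "PSUQSRV", "PSMCFLOG"],
        "DOM2": ["PSUQSRV", "PSUQSRV"],  # violates the one-per-domain limit
    }
    print(over_limit(plan))
```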

Image: PeopleSoft MultiChannel Framework cluster architecture

The following diagram illustrates MCF cluster architecture.

The queue server is a server process in the PeopleSoft application domain that routes email, chat, and generic tasks to the agent based on the agent properties, such as state and skill set, and task routing properties, such as priority, language, and cost.
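The following sketch illustrates routing on agent and task properties. It is not the queue server's actual algorithm, which this section does not specify. The matching rules (language match, minimum skill level, remaining capacity, least-loaded agent wins) and all class and field names are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agent:
    name: str
    available: bool      # agent state (assumed boolean here)
    languages: set
    skill_level: int
    max_workload: int    # capacity expressed in task cost units (assumption)
    workload: int = 0


@dataclass
class Task:
    task_id: str
    priority: int        # higher value is routed first (assumption)
    language: str
    cost: int
    skill_level: int = 0


def pick_agent(task, agents) -> Optional[Agent]:
    """Return the least-loaded eligible agent, or None if nobody qualifies."""
    candidates = [
        a for a in agents
        if a.available
        and task.language in a.languages
        and a.skill_level >= task.skill_level
        and a.workload + task.cost <= a.max_workload
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda a: a.workload)


def route(tasks, agents):
    """Assign tasks in descending priority order; return {task_id: agent_name}."""
    assignments = {}
    for task in sorted(tasks, key=lambda t: t.priority, reverse=True):
        agent = pick_agent(task, agents)
        if agent is not None:
            agent.workload += task.cost
            assignments[task.task_id] = agent.name
    return assignments


if __name__ == "__main__":
    agents = [
        Agent("ann", True, {"ENG"}, skill_level=3, max_workload=10),
        Agent("bob", True, {"ENG", "FRA"}, skill_level=1, max_workload=10),
    ]
    tasks = [
        Task("t1", priority=1, language="FRA", cost=4),
        Task("t2", priority=5, language="ENG", cost=6, skill_level=2),
    ]
    print(route(tasks, agents))
```

Tasks that no agent can accept are left unassigned in this sketch; a real router would hold them in the queue.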

The queue server process (PSUQSRV) is a Tuxedo-managed server with a standard PeopleSoft database connection. Each queue server process is the central routing point for one or more physical queues. The queue server maintains state information for work requests, work in progress, agent availability, and agent workload. Queue server state is written to database records, except for the assignment of chat to agents.

The queue server can recover from a crash because most of its state is written to the database. When a queue server reboots, it checks the database and loads state information for open work tasks. The queue server connects to its REN server and issues restart queries to each console so that it can rebuild agent assignment information, which may have changed while the queue server was down.

Although a single queue server can restore its state after a software failure, this does not guard against hardware failure. Multiple queue server processes that run on multiple host machines and are configured in a cluster provide failover for hardware failure. Unlike the REN server cluster, the clustered queue servers operate as one master and many slaves. The master handles all routing decisions, while slave processes monitor the master and step in only if it fails. Any rebooted queue server rejoins the cluster in a slave role. Any slave that is promoted to master loads state from the database and issues queries to consoles as if it were the only process in the cluster.

Each queue server process follows a fixed procedure to ensure that the cluster has at most one functioning master. Database locks eliminate possible race conditions, and the master periodically writes a timestamp to indicate its health. The masterinterval parameter controls the frequency at which the master process must update the timestamp in the cluster table. The masterinterval parameter corresponds to the maximum time after a master queue server fails before another queue server process takes over. Minimizing this value provides rapid failover response time but also requires frequent database updates.
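The trade-off between failover time and database load can be sketched as follows. This is an illustration only: an in-memory object stands in for the cluster table, masterinterval is assumed to be expressed in seconds, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterRow:
    """In-memory stand-in for the master entry in the cluster table."""
    master_id: str
    last_heartbeat: float  # time of the master's last timestamp update, in seconds


def heartbeat(row, server_id, now):
    """Only the current master refreshes the timestamp."""
    if row.master_id == server_id:
        row.last_heartbeat = now


def master_is_stale(row, now, masterinterval):
    """A slave treats the master as failed once the timestamp is older than masterinterval."""
    return now - row.last_heartbeat > masterinterval


def updates_per_hour(masterinterval):
    """Minimum timestamp writes per hour needed to stay within masterinterval."""
    return 3600 / masterinterval


if __name__ == "__main__":
    row = ClusterRow("qs1", last_heartbeat=100.0)
    print(master_is_stale(row, now=105.0, masterinterval=10))
    print(master_is_stale(row, now=111.0, masterinterval=10))
    for interval in (5, 30, 120):
        print(interval, updates_per_hour(interval))
```

A shorter interval lets slaves detect a failed master sooner, at the cost of proportionally more timestamp writes.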

See Tuning Cluster Parameters.

Each queue server must be part of an MCF cluster, and each MCF cluster must include at least one queue server. An MCF cluster of only one queue server provides no redundancy against hardware failure.

To create a queue server that starts whenever the application server domain starts, select MCF servers from the quick-configure menu during application server domain configuration.

In summary, configure queue servers to provide hardware failover. Each queue server is part of an MCF cluster. To support hardware failover, distribute master and slave queue servers over multiple hosts. Every queue server in an MCF cluster communicates with the same REN server ID. Therefore, REN server failover is also crucial.

The first queue server that places a valid master entry for itself in the cluster table becomes the master queue server. In most cases, the master queue server is the first queue server started. No configuration parameter exists to designate master or slave queue server within a cluster.

After an MCF cluster's master queue server is established, all other cluster members become slave queue servers. If the slave queue servers within a cluster detect a failure of the master queue server, the remaining slave servers compete to become the master queue server. If the master queue server reboots before a slave takes over, the master queue server also competes. No configuration parameter exists to designate priority among slave servers.
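The competition for the master role can be sketched as a first-writer-wins claim. In this illustration a threading.Lock stands in for the database lock and a small in-memory object stands in for the cluster table; the class, method names, and server IDs are hypothetical.

```python
import threading


class ClusterTable:
    """In-memory stand-in for the cluster table; the lock models the database lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.master_id = None
        self.last_heartbeat = None

    def try_claim(self, server_id, now, masterinterval):
        """Claim the master role only if the entry is vacant or stale."""
        with self._lock:
            vacant = self.master_id is None
            stale = (not vacant) and (now - self.last_heartbeat > masterinterval)
            if vacant or stale:
                self.master_id = server_id
                self.last_heartbeat = now
                return True
            return False


def race(table, server_ids, now, masterinterval):
    """Let every server attempt the claim concurrently; return the winners."""
    winners = []

    def worker(server_id):
        if table.try_claim(server_id, now, masterinterval):
            winners.append(server_id)

    threads = [threading.Thread(target=worker, args=(sid,)) for sid in server_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners


if __name__ == "__main__":
    table = ClusterTable()
    servers = ["qs1", "qs2", "qs3"]
    print(race(table, servers, now=0.0, masterinterval=10))   # initial startup
    print(race(table, servers, now=5.0, masterinterval=10))   # master entry still valid
    print(race(table, servers, now=20.0, masterinterval=10))  # master entry is stale
```

Because the check and the write happen under one lock, at most one contender succeeds in each round. No server is favored, which matches the absence of a priority parameter; a rebooted former master competes on the same terms as the slaves.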

See Understanding REN Servers.

PeopleSoft MultiChannel Framework enables the configuration of both logical and physical queues.

A logical queue is an application-level queue that receives work requests (tasks) relating to an application area, such as chat requests regarding sales information, and routes them to agents that are capable of handling the work. For example, you might configure a logical queue called SALES for sales inquiries and another called SUPPORT for support issues.

Logical queues can be partitioned into physical queues for scalability. A physical queue is managed by a single MCF cluster. For scalability, the tasks that are enqueued on a logical queue are distributed by the framework among all available physical queues. For example, the SALES queue could be serviced by four MCF clusters across four physical queues: SALES1, SALES2, SALES3, and SALES4.

Each agent can be assigned to only one physical queue within each logical queue. Each agent can be assigned to multiple logical queues.
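The assignment rule can be checked mechanically. The following sketch assumes a hypothetical list of (agent, physical queue) pairs and a mapping from physical to logical queues; the agent names are illustrative, and the queue names follow the SALES and SUPPORT examples above.

```python
from collections import defaultdict


def find_conflicts(assignments, physical_to_logical):
    """Return {(agent, logical_queue): [physical queues]} where the rule is broken.

    The rule: an agent may belong to several logical queues, but to only
    one physical queue within each logical queue.
    """
    per_agent = defaultdict(lambda: defaultdict(set))
    for agent, physical in assignments:
        per_agent[agent][physical_to_logical[physical]].add(physical)

    conflicts = {}
    for agent, by_logical in per_agent.items():
        for logical, physicals in by_logical.items():
            if len(physicals) > 1:
                conflicts[(agent, logical)] = sorted(physicals)
    return conflicts


if __name__ == "__main__":
    physical_to_logical = {
        "SALES1": "SALES",
        "SALES2": "SALES",
        "SUPPORT1": "SUPPORT",
    }
    assignments = [
        ("ann", "SALES1"),
        ("ann", "SUPPORT1"),  # allowed: a different logical queue
        ("bob", "SALES1"),
        ("bob", "SALES2"),    # not allowed: two physical queues within SALES
    ]
    print(find_conflicts(assignments, physical_to_logical))
```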

The MCF log server (PSMCFLOG) is a Tuxedo-managed server that is similar to the queue server. Each MCF log server receives events that are sent by a REN server and is responsible for writing MCF events to the database.

The MCF log server logs events to PS_MCFUQEVENTLOG. By default, the log server does not log periodic state information broadcasts from the queue server to the MultiChannel Console. If you need to log these events, configure logging on the Cluster Tuning page. You can also configure the log server to log the contents of chat sessions. Chat session logging is deactivated by default. Logged chats are stored in PS_MCFCHATLOG.

If the MCF log server crashes, it resumes functioning immediately after restarting. In a cluster, the first slave log server that detects a failed master takes over as the master log server. The new master log server again receives all base topics, but it does not log chat sessions that started or continued while the original master log server was down. The new master log server also does not log per-agent events for agents who were signed in when the failure occurred or who signed in while the master was down.

An MCF log server is created along with a queue server when you enable MCF servers during application server domain configuration. No specific log server configuration is available during domain configuration.

PeopleSoft MultiChannel Framework is scalable to support large-capacity call centers or other large organizations. The basic strategy is to divide the workload by spreading it over several MCF clusters. This is accomplished by creating multiple physical queues for each logical queue and spreading the management responsibility for each physical queue to separate queue server processes, preferably on multiple host machines. This technique should not be confused with failover protection, which also adds processes and machines. In failover, the added processes are clustered together and do not provide performance improvement.

Organize applications that use PeopleSoft MultiChannel Framework around logical queues (for example, a SALES queue and a SUPPORT queue). Incoming work tasks are sent to a logical queue. PeopleSoft MultiChannel Framework then assigns each task to one of the corresponding physical queues. This assignment is random across the queues. Load is balanced across the servers by having each MCF cluster service only one physical queue per logical queue.

For example, a logical SUPPORT queue might be split into physical SUPPORT1 and SUPPORT2 queues such that work requests are randomly distributed between the two physical queues. Half the agents receive from one queue and half from the other. This splits the workload evenly between the two queue server processes, while still presenting one logical SUPPORT queue to the application.
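The effect of random assignment can be seen in a small simulation. This sketch assumes a uniform random choice among the physical queues; the function names and the seed parameter are hypothetical and exist only to make the run repeatable.

```python
import random
from collections import Counter


def enqueue(logical_queue, physical_queues, rng):
    """Pick one physical queue at random for a task sent to `logical_queue`."""
    return rng.choice(physical_queues[logical_queue])


def simulate(n_tasks, logical_queue, physical_queues, seed=0):
    """Count how many of `n_tasks` land on each physical queue."""
    rng = random.Random(seed)
    return Counter(
        enqueue(logical_queue, physical_queues, rng) for _ in range(n_tasks)
    )


if __name__ == "__main__":
    physical_queues = {"SUPPORT": ["SUPPORT1", "SUPPORT2"]}
    counts = simulate(10_000, "SUPPORT", physical_queues)
    for name, count in sorted(counts.items()):
        print(name, count)
```

Over a large number of tasks the counts converge toward an even split, although any short interval can be uneven. Applications still address only the logical SUPPORT queue.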

Consider the following configuration options to ensure maximum reliability and scalability of your PeopleSoft MultiChannel Framework installation:

  • Configure multiple MCF servers in a cluster across multiple host machines.

    This provides protection against single-point failures.

    Each MCF cluster requires a REN server cluster. Configuring multiple REN server clusters is functionally the same as configuring multiple MCF clusters for scalability. Inside a REN server cluster, configuring multiple REN servers is functionally the same as configuring multiple queue servers for failover.

    See Configuring PeopleSoft MCF Clusters.

  • Use REN server clusters only for failover.

    REN server clusters do not enhance performance.

  • Split logical queues into more than one physical queue if more work is required on that queue than a single process or machine can handle.

  • If an application server domain is likely to be restarted regularly for reasons that are not related to PeopleSoft MultiChannel Framework, configure PeopleSoft MultiChannel Framework in a separate domain.

    Regular restarting of MCF servers affects performance because the MCF servers must recover state when they are recycled or when a slave takes over from a master server.