20 Introduction to Coherence Topics

Coherence Topics, introduced in Oracle Coherence 14.1.1.0.0, provides publish and subscribe messaging functionality.

This chapter includes the following sections:

Topics Overview

The Topics API enables the building of data pipelines between loosely coupled producers and consumers.

One or more publishers publish a stream of values to a Topic. One or more subscribers consume the stream of values from the Topic. The Topic values are spread evenly across all Oracle Coherence data servers, enabling high-throughput processing in a distributed and fault-tolerant manner. Each direct subscriber to a Topic receives all values published to the Topic.

In publish and subscribe messaging, messages are delivered only to active subscribers of the Topic. Subscriber groups add queue processing capabilities to Topics. After a subscriber group is created on a Topic, all values sent to the Topic are accumulated for the subscriber group. One or more subscribers can subscribe to a subscriber group. Each value accumulated by a subscriber group is delivered to only one member of that group, which enables distributed, parallel processing of the values.
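The difference between direct subscribers and subscriber group members can be illustrated with a small in-memory model. This is only a sketch of the delivery semantics described above and does not use the Coherence API; the class TopicDeliverySketch and its methods are hypothetical names chosen for illustration, and the model assumes a single subscriber group that exists from the moment the topic is created.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model of Topic delivery semantics. Hypothetical; not the Coherence API.
final class TopicDeliverySketch {
    // Every direct subscriber has its own inbox and sees every value.
    private final List<Queue<String>> directSubscribers = new ArrayList<>();
    // Values accumulate here for the (single, assumed) subscriber group.
    private final Queue<String> groupBacklog = new ArrayDeque<>();

    // Registers a direct subscriber and returns its inbox.
    Queue<String> subscribeDirect() {
        Queue<String> inbox = new ArrayDeque<>();
        directSubscribers.add(inbox);
        return inbox;
    }

    // Publishes a value: fan-out to direct subscribers, accumulate for the group.
    void publish(String value) {
        for (Queue<String> inbox : directSubscribers) {
            inbox.add(value);
        }
        groupBacklog.add(value);
    }

    // Called by any member of the group. Each accumulated value is handed
    // to exactly one caller; returns null when nothing is pending.
    String receiveAsGroupMember() {
        return groupBacklog.poll();
    }
}
```

In this model, two direct subscribers each see every published value, while two group members calling receiveAsGroupMember() split the accumulated values between them.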

Note:

For examples of basic Topic operations, see Performing Basic Topic Publish and Subscribe Operations.

Figure 20-2 Topic Values in a Subscriber Group



About Channels

To scale publishers and subscribers while still guaranteeing message ordering, Coherence Topics introduces the concept of Channels. A Channel is similar in concept to a partition in a Coherence distributed cache; to avoid confusion, the name partition is not reused. Channels are an important part of the operation of both publishers and subscribers.

A publisher is configured to control the ordering guarantees that apply to its messages when subscribers receive them. This is achieved by publishing messages to specific Channels. All messages that a publisher publishes to a Channel are received by subscribers in the order in which that publisher published them to that Channel. Messages published to a Channel by different publishers may be interleaved; there is no ordering guarantee across different publishers. Messages published to different Channels may also be interleaved as they are received by subscribers, and may be received by different subscribers in a subscriber group.
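The following sketch illustrates how directing related messages to the same Channel preserves their relative order. It is a toy model, not the Coherence API: the class ChannelSelectionSketch is hypothetical, and the rule that maps an integer ordering id to a Channel by taking it modulo the Channel count is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of publishing to specific Channels. Hypothetical; not the Coherence API.
final class ChannelSelectionSketch {
    // One ordered list of messages per Channel.
    private final List<List<String>> channels = new ArrayList<>();

    ChannelSelectionSketch(int channelCount) {
        for (int i = 0; i < channelCount; i++) {
            channels.add(new ArrayList<>());
        }
    }

    // Assumed rule: the ordering id modulo the Channel count selects the Channel.
    // floorMod keeps the result non-negative for negative ids.
    int channelFor(int orderingId) {
        return Math.floorMod(orderingId, channels.size());
    }

    // Appends the message to the selected Channel and returns the Channel index.
    int publish(int orderingId, String message) {
        int channel = channelFor(orderingId);
        channels.get(channel).add(message);
        return channel;
    }

    // Returns a copy of the messages in a Channel, in publication order.
    List<String> messagesIn(int channel) {
        return new ArrayList<>(channels.get(channel));
    }
}
```

With 17 Channels, messages published with ordering id 42 always land in the same Channel and keep their relative order, even when messages with other ordering ids are published in between.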

Having multiple Channels allows publishers to scale better, because it avoids the contention that can occur when many publishers publish to a single Channel. Message consumption also scales, because multiple subscribers in a group subscribe to different Channels, which increases the rate at which messages are received while maintaining order within each Channel.

As new subscribers are created in the same subscriber group, the available Channels are allocated as fairly as possible among the subscribers. If there are more subscribers than Channels, then some subscribers will not be assigned a Channel and will not receive any messages. If a subscriber disconnects for some reason, such as failure or a heart-beat timeout, then its assigned Channels are reallocated to the remaining subscribers. It is possible to register listeners to be notified of Channel allocation changes.
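The allocation behavior described above can be sketched with a simple round-robin assignment. This is an illustration only: the class ChannelAllocationSketch is hypothetical, subscriber names are assumed to be unique, and round-robin is an assumed strategy, because the source does not specify the algorithm Coherence uses to allocate Channels.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of Channel allocation within one subscriber group.
// Hypothetical; not the Coherence API or its actual allocation algorithm.
final class ChannelAllocationSketch {
    // Assigns Channels 0..channelCount-1 to subscribers in round-robin order.
    // Subscribers beyond the Channel count end up with an empty list.
    static Map<String, List<Integer>> allocate(int channelCount, List<String> subscribers) {
        Map<String, List<Integer>> allocation = new LinkedHashMap<>();
        for (String subscriber : subscribers) {
            allocation.put(subscriber, new ArrayList<>());
        }
        if (subscribers.isEmpty()) {
            return allocation;
        }
        for (int channel = 0; channel < channelCount; channel++) {
            String owner = subscribers.get(channel % subscribers.size());
            allocation.get(owner).add(channel);
        }
        return allocation;
    }
}
```

Calling allocate again with the remaining subscribers models the reallocation that takes place when a subscriber disconnects: every Channel still has exactly one owner, and the Channels of the departed subscriber are spread over the others.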

The Channel count for a Topic is configurable and should ideally be a small prime number; the default is 17. Very small and very large Channel counts each have advantages and disadvantages, depending on the application use case and the scaling and ordering guarantees that it requires.

About Position

Every element published to a Topic has a Position. A Position is an opaque data structure used by the underlying NamedTopic implementation to track the location of an element in a Channel and to maintain the ordering of messages.

Subscribers use Positions to track the elements that they have received. When a subscriber commits a Position, the Position determines which preceding elements are also committed. When a subscriber reconnects or recovers from failure, the committed Position allows it to resume from the correct Position in the Topic.

Although Positions are opaque data structures, they are serializable, so application code that wants to track element processing manually can store them in a separate data store. The combination of Channel and Position should be unique for each element published and received.
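Manual tracking of committed Positions by application code might look like the following sketch. It is not the Coherence API: the class PositionTrackerSketch is hypothetical, and because a real Position is opaque, the sketch assumes a stand-in where a Position is a long offset that increases within each Channel.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of per-Channel commit tracking. Hypothetical; not the Coherence API.
// A real Position is opaque; a long offset is used here as a stand-in.
final class PositionTrackerSketch {
    // Last committed offset for each Channel; absent means nothing is committed.
    private final Map<Integer, Long> committed = new HashMap<>();

    // Commits an offset in a Channel. A commit never moves backwards.
    void commit(int channel, long offset) {
        committed.merge(channel, offset, Long::max);
    }

    // An offset counts as committed if it is at or before the last commit
    // in the same Channel, so committing one offset covers all preceding ones.
    boolean isCommitted(int channel, long offset) {
        Long last = committed.get(channel);
        return last != null && offset <= last;
    }

    // Offset to resume from after a reconnect: the one following the last
    // commit, or 0 when nothing has been committed in that Channel.
    long resumeFrom(int channel) {
        Long last = committed.get(channel);
        return last == null ? 0L : last + 1;
    }
}
```

The tracker is keyed by Channel because a Position is only meaningful in combination with its Channel; a commit in one Channel says nothing about elements in another.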