Performance Considerations
Design your consumer for low latency, steady throughput, ordered processing, and fast recovery. The following practices help you keep up with production event rates while preserving correctness and minimizing cost.
Deploy Close to the Source
Place your consumer in the same cloud/region as the OPERA environment to minimize round‑trip latency and increase effective throughput. For best results, deploy on Oracle Cloud Infrastructure in‑region with your OPERA Cloud gateway.
Trim Payload Size to Reduce Bandwidth and CPU
- Filter at subscription time. Request only the fields you will use; include primaryKey and the metadata you need, but avoid wide payloads (a trimmed subscription sketch follows this list).
- Use delta events where appropriate. If your logic needs only changed attributes, set delta: true to shrink messages and increase events/second.
- Consider orchestration. If you primarily need the latest resource state, omit the detail array and fetch the resource on demand using the primaryKey (see Orchestration). Account for REST rate limits when doing this.
- If your streaming configuration predates OHIP 25.4, update it in the Developer Portal so only filtered business elements are emitted in payloads; this can dramatically reduce message size.
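For illustration, a trimmed subscription along the following lines requests only the fields a typical consumer uses and enables delta mode. The frame layout, argument names, and chain code are a sketch based on the guidance above, not a complete schema; verify them against Subscribing and Consuming Events for your OHIP version.
const subscribeFrame = {
  id: "1",                      // reuse this id when sending the matching "complete"
  type: "subscribe",
  payload: {
    query: `
      subscription {
        newEvent(chainCode: "MYCHAIN", delta: true) {   # arguments are illustrative
          metadata { offset uniqueEventId }
          eventName
          primaryKey
          timestamp
          detail { elementName newValue oldValue }
        }
      }`,
  },
};

// socket is an already authenticated WebSocket connection:
// socket.send(JSON.stringify(subscribeFrame));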
Design the Intake Path for At-Least-Once, Ordered Processing
- Keep exactly one active subscriber per (appKey, chainCode, gateway) to preserve ordering.
- Use an in‑memory queue or lightweight log to absorb bursts; offload heavy work (I/O, downstream calls) to worker threads or services, so the WebSocket event loop stays unblocked.
- After successfully processing an event, persist both offset and uniqueEventId. Use them to resume (offset) and for idempotency/duplicate suppression.
- Only persist an offset after side effects complete (writes, API calls). This guarantees at-least-once semantics without losing work.
Important: The server does not provide or require application-level acknowledgments, and there is no platform DLQ. Treat "successful processing + persisted offset" as your acknowledgment. On failure, do not block the WebSocket loop; enqueue the event to your own DLQ and continue draining, preserving ordering and recoverability through your persisted offset and uniqueEventId.
Offset and Idempotency (pseudocode)
onNext(event):
// fast path: enqueue for processing to avoid blocking websocket loop
queue.push(event)
workerLoop():
while queue.notEmpty():
e = queue.pop()
// idempotency guard: skip if we've already processed this uniqueEventId
if store.exists(e.metadata.uniqueEventId): continue
try:
businessWrite(e) // DB writes, downstream API calls, etc.
store.save({
offset: e.metadata.offset, // persist only after side effects succeed
uniqueEventId: e.metadata.uniqueEventId
})
catch err:
dlq.push({event: e, error: err})
// do NOT rethrow; continue draining to avoid backlog growth
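The same loop in minimal TypeScript might look as follows; the store, dlq, and businessWrite interfaces are hypothetical placeholders for your own persistence, dead-letter, and downstream components.
interface StreamEvent {
  metadata: { offset: string; uniqueEventId: string };
  eventName: string;
}

// Hypothetical application-supplied components.
declare const store: {
  exists(uniqueEventId: string): Promise<boolean>;
  save(rec: { offset: string; uniqueEventId: string }): Promise<void>;
};
declare const dlq: { push(entry: { event: StreamEvent; error: unknown }): void };
declare function businessWrite(e: StreamEvent): Promise<void>;

const queue: StreamEvent[] = [];

// Called from the WebSocket "next" handler: enqueue only, never do I/O here.
function onNext(event: StreamEvent): void {
  queue.push(event);
}

async function workerLoop(): Promise<void> {
  for (;;) {
    const e = queue.shift();
    if (!e) { await new Promise((r) => setTimeout(r, 50)); continue; }  // idle briefly when drained
    if (await store.exists(e.metadata.uniqueEventId)) continue;        // duplicate suppression
    try {
      await businessWrite(e);                                          // side effects first
      await store.save(e.metadata);                                    // then persist offset + uniqueEventId
    } catch (err) {
      dlq.push({ event: e, error: err });                              // own DLQ; keep draining
    }
  }
}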
Handle Bursts and Backpressure
- Expect burst delivery when messages are large or the backlog grows (see Backpressure Mode). Your consumer must be non‑blocking and able to drain a queue efficiently.
- During bursts, the server may defer replying to client pings with pong until the current burst completes; treat continued next messages as proof of life. See Backpressure Mode and Maintain a Healthy Connection and Token Lifecycle (see heading below).
- If you consistently fall behind:
  - Reduce payload width (use filters, delta, omit the detail array, and orchestrate).
  - Improve locality (run closer to the gateway).
  - Split load across applications (by event type or hotel subsets) while respecting the single-consumer rule per stream and overall app limits (see Scaling Recommendations).
- Use asynchronous I/O and batched writes to downstream systems to increase throughput (see the batching sketch after this list).
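As one way to apply the last point, the sketch below groups processed records into periodic bulk writes so a single slow downstream call never blocks the drain loop. The batch size, flush interval, and writeBatch helper are assumptions standing in for your own bulk API or database call.
const MAX_BATCH = 200;
const FLUSH_INTERVAL_MS = 500;

let batch: unknown[] = [];

// Placeholder for your bulk insert or bulk API call; keep it idempotent.
async function writeBatch(items: unknown[]): Promise<void> { /* ... */ }

function enqueueForDownstream(record: unknown): void {
  batch.push(record);
  if (batch.length >= MAX_BATCH) void flush();        // size-triggered flush
}

async function flush(): Promise<void> {
  if (batch.length === 0) return;
  const toWrite = batch;
  batch = [];
  try {
    await writeBatch(toWrite);                        // one bulk call instead of N round trips
  } catch (err) {
    batch = toWrite.concat(batch);                    // retry on the next flush, or route to your DLQ
  }
}

setInterval(() => void flush(), FLUSH_INTERVAL_MS);   // time-triggered flush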
Maintain a Healthy Connection and Token Lifecycle
- Keep the connection persistent; reconnect promptly on drops.
- Send a ping every 15 seconds and respond to pings with pong to avoid idle termination.
- Server heartbeat timeout: If the server does not receive a pong within 180 seconds, it closes the connection. See Subscribing and Consuming Events for heartbeat notes and Backpressure Mode.
- Client-side timeout policy: Close and reconnect if a pong is not observed in time, but make the threshold dynamic. Measure latency by timestamping each ping (startTime) and its corresponding pong (receiveTime), compute RTT = receiveTime - startTime, and maintain a smoothed average (SRTT). A conservative policy is to trigger a reconnect if no pong is seen within max(180s, 4 x SRTT + small jitter); a sketch follows the token-refresh pseudocode below. Do not evaluate this timer while a backpressure burst is in progress. See Handle Bursts and Backpressure (see heading above) and Backpressure Mode.
- Monitor token expiry (exp) and proactively disconnect before the token expires, obtain a new token, and then reconnect, either without specifying offset or with the persisted offset.
- If you disconnect and reconnect after 24 hours, you must supply the last offset to resume replay. Events are retained for 7 days.
Token Refresh and Reconnection (pseudocode)
const REFRESH_MARGIN_SECONDS = 120 // refresh ~2 minutes before expiry
onConnected():
startHeartbeat(15000 ms) // send ping every 15s and handle pong
heartbeatTick():
send({"type": "ping"})
maybeRefresh():
if token.expiresAt - now() <= REFRESH_MARGIN_SECONDS:
// Preemptive refresh sequence
send({"id": subscribeId, "type": "complete"}) // use same subscribe id
await serverClose()
sleep(10000 ms) // required 10s gap
token = fetchNewToken()
connectAndSubscribe({
chainCode,
// include offset if last connect was > 24h ago or you persisted it on shutdown
offset: persistedOffsetOrNull()
})
onClose(code, reason):
// Backoffs and recovery strategy
if code == 4409: sleep(120000 ms + jitter()) // single-consumer lock (~2m)
else if code == 4504: sleep(15000 ms) // service timeout
else if code == 4401: token = fetchNewToken() // unauthorized (e.g., expired)
reconnectWith(offset = persistedOffsetOrNull())
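To make the dynamic timeout policy concrete, the sketch below smooths the measured ping/pong round-trip time and reconnects only when no pong arrives within max(180s, 4 x SRTT + jitter), skipping the check while a backpressure burst is being drained. The smoothing factor, initial SRTT, and jitter range are illustrative assumptions.
const SERVER_TIMEOUT_MS = 180_000;   // server closes the socket after 180 s without a pong
const ALPHA = 0.125;                 // smoothing factor (assumption)

let srttMs = 1_000;                  // initial guess, refined by measurements
let lastPingSentAt = 0;
let lastPongReceivedAt = Date.now();
let inBackpressureBurst = false;     // set while a burst is being drained

function onPingSent(): void {
  lastPingSentAt = Date.now();
}

function onPongReceived(): void {
  lastPongReceivedAt = Date.now();
  const rtt = lastPongReceivedAt - lastPingSentAt;             // RTT = receiveTime - startTime
  srttMs = (1 - ALPHA) * srttMs + ALPHA * rtt;                 // exponentially smoothed RTT
}

function shouldReconnect(): boolean {
  if (inBackpressureBurst) return false;                       // never time out mid-burst
  const jitterMs = Math.random() * 1_000;
  const thresholdMs = Math.max(SERVER_TIMEOUT_MS, 4 * srttMs + jitterMs);
  return Date.now() - lastPongReceivedAt > thresholdMs;
}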
Respect Protocol and Gateway Limits
- Headers must fit within 8 KB (token + app key + x‑request‑id, and so on).
- Wait at least 10 seconds between sending complete and the next subscribe.
- If you hit 4409 (single-consumer lock), wait for the minimum lockout (~2 minutes) plus jitter before retrying.
- Ensure outbound wss:// is allowed and not intercepted; avoid TLS inspection that can degrade performance or break WebSocket upgrades. Keep NAT/firewall idle timeouts higher than your heartbeat interval.
Observe, Measure, and Tune
- Track produced vs. consumed events, backlog/lag (offset distance), end‑to‑end latency (event timestamp to processing completion), throughput (events/sec), disconnect/close reasons, and error codes (4401/4403/4409/4504).
- Use the Developer Portal Analytics to visualize production vs. consumption and verify you are keeping up.
- Log uniqueEventId, offset, and eventName for diagnostics; avoid logging PII from detail (see the logging sketch after this list).
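For example, each processed event could emit one structured log line like the sketch below; the field names are illustrative, and the lag calculation assumes your consumer tracks the newest offset delivered on the stream.
// latestStreamOffset is whatever your consumer tracks as the newest delivered offset.
function logProcessed(
  event: { metadata: { offset: string; uniqueEventId: string }; eventName: string; timestamp: string },
  latestStreamOffset: number
): void {
  console.log(JSON.stringify({
    uniqueEventId: event.metadata.uniqueEventId,
    offset: event.metadata.offset,
    eventName: event.eventName,
    lagOffsets: latestStreamOffset - Number(event.metadata.offset),   // backlog distance
    latencyMs: Date.now() - Date.parse(event.timestamp),              // event time to completion
    // deliberately no 'detail' fields: they may contain PII
  }));
}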
Cost and Efficiency (Partners)
Partners pay per event. Use filters and delta mode to cut noise, drop unnecessary fields, and coalesce events for the same resource to avoid duplicate orchestration calls.
Pre‑go‑live Performance Checklist
- Consumer is deployed in‑region with the OPERA gateway (or as close as feasible).
- Subscription filters are minimal; delta mode used where appropriate.
- Orchestration pattern validated with REST rate limits and retries.
- Single active subscriber enforced; passive standby uses status checks and jittered failover.
- Offset and uniqueEventId persisted after successful processing; idempotency verified.
- Ping/pong implemented at ~15s; reconnect logic handles 4401/4403/4409/4504.
- Header size within 8 KB; minimum 10 seconds between complete and a new subscribe.
- Monitoring and alerting in place for lag, disconnects, and error codes.
- Smoke-tested end-to-end in UAT or sandbox: performed a simple REST write (for example, create a new profile) and confirmed the corresponding Business Event was received by the consumer.