The Throttling filter enables you to limit the number of requests that pass through in a specified time period. For example, you can enforce a message quota or rate limit on a client, or protect a backend service from message flooding. You can configure this filter to allow only a specified number of messages from a specified client through to a virtualized API over a configured time frame. If the number of messages exceeds the specified limit, the filter fails for the excess messages.

Important

The Throttling filter succeeds for incoming messages that meet the specified constraints. For example, if the filter is configured to allow 20 messages through per second, it fails on message 21, but passes for the first 20 incoming messages.

When the configured constraints are breached, the API Gateway behavior is determined by the filter next in the policy failure path from the Throttling filter. Typically, an Alert, Trace, or Log filter is configured as the successor filter in the failure path.

An example use case for this filter is to enforce a message rate limit specified in a contract agreed with a customer (for example, the customer has purchased a maximum of 100 messages per client per hour). Another example is to protect a service that can handle a maximum of only 20 messages per client per second. If the filter detects a higher number of incoming requests, it fails for the excess messages.

Rate limit settings

Configure the following rate limit settings:


Name:

Enter an appropriate name for the filter.


Allow:

Specifies the number of messages allowed in the time interval specified in the Messages every field. If the API Gateway receives more than the specified number of messages during the specified time interval, the filter fails. Otherwise, the filter passes.

Messages every:

Specifies the allowed time interval. If the API Gateway receives more than the number of messages in the Allow field in the time interval specified, the filter fails. Otherwise, the filter passes. The time interval depends on the value selected in the drop-down box on the right (seconds, minutes, hours, days, or weeks). For example, if you enter 10 in the Messages every field, and select Minutes from the drop-down list, the time interval lasts 10 minutes.

Note

The specified time period starts when a message is received, and lasts for that time period (for example, 10 minutes). When the time period is over, the message count is reset, and the counter starts again when another message is received. If you select Day or Week from the drop-down list, you must configure the When do days/weeks start fields on the Advanced tab.
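The counting behavior described in this note can be sketched as follows. This is a minimal illustration only, not the API Gateway implementation: the class name FixedWindowCounter and its parameters are hypothetical, and time is passed in explicitly as seconds so that the example is deterministic.

```python
class FixedWindowCounter:
    """Counts messages in a window that opens when a message arrives."""

    def __init__(self, allow, interval_seconds):
        self.allow = allow                # value of the Allow field
        self.interval = interval_seconds  # value of the Messages every field, in seconds
        self.window_start = None          # no window is open until a message arrives
        self.count = 0

    def check(self, now):
        """Return True if the message received at time `now` passes."""
        # Open a new window on the first message, or once the old window is over.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count <= self.allow


counter = FixedWindowCounter(allow=20, interval_seconds=1)
results = [counter.check(now=0.0) for _ in range(21)]
print(results.count(True), results[-1])  # 20 messages pass, message 21 fails
print(counter.check(now=1.5))            # the window is over, so the count restarts
```

In this sketch the 21st message fails because it arrives in the same one-second window as the first 20, and a message that arrives after the window is over starts a new count.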

Rate limit based on:

Select this setting if you wish to configure the API Gateway to keep track of request messages based on a specified key value (for example, to track rates based on client IP address). All rate limits are stored in a cache, and a key is required to look up the cache. The key can be any value that identifies the message sender. For example, this includes the authenticated subject, the client IP address, or even a combination of API method name and IP address. If you do not specify a key, the cache will contain a single entry for the filter, and the associated rate limit is used each time the filter is invoked.
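The role of the key can be illustrated with a short sketch. It assumes a plain in-memory counter in place of the real cache; the function count_message and the DEFAULT_KEY placeholder are hypothetical names used only for illustration.

```python
from collections import Counter

DEFAULT_KEY = "<no key>"  # hypothetical stand-in for the single unkeyed entry


def count_message(cache, key=None):
    """Increment and return the message count stored under `key`."""
    entry = key if key is not None else DEFAULT_KEY
    cache[entry] += 1
    return cache[entry]


# Keyed by client IP address: each sender has its own count.
cache = Counter()
for client_ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    count_message(cache, key=client_ip)
print(dict(cache))

# No key: every message is counted against the same single entry.
unkeyed = Counter()
for _ in range(3):
    count_message(unkeyed)
print(dict(unkeyed))
```

With a key, two messages from 10.0.0.1 and one from 10.0.0.2 are counted separately. Without a key, all three messages share one count.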

To track rate limits based on the client IP address, use the following setting:

${http.request.clientaddr.getAddress()}
The value entered can also be a combination of a fixed string value and/or an API Gateway message attribute selector. For example, you could use the following value to keep track of the number of times an API named StockQuote is requested by an authenticated subject:


Store rate limits in cache:

In cases where multiple API Gateways are deployed for load balancing purposes, and you want to maintain a single count of all messages processed by all API Gateway instances, you can configure a distributed cache to cache request messages.

For example, you wish to prevent a burst of more than 50 messages per second from reaching a backend service. Assume that a load balancer is deployed in front of two API Gateway instances, and round-robins requests between these two instances. By caching request messages in a global distributed cache, which is inherently replicated across all API Gateway instances, the Throttling filter can compute the total number of messages in the distributed cache, and therefore the total number of messages processed by all API Gateway instances.
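The difference between local and shared counting in this scenario can be sketched as follows. This is a simplified model under stated assumptions: the Gateway class, the run function, and the use of a Python Counter as the cache are all hypothetical, and replication delays in a real distributed cache are ignored.

```python
from collections import Counter
from itertools import cycle

LIMIT = 50  # messages allowed in the time window


class Gateway:
    """A hypothetical API Gateway instance that counts messages in a cache."""

    def __init__(self, cache):
        self.cache = cache  # private to this instance, or shared between instances

    def handle(self, key="backend"):
        self.cache[key] += 1
        return self.cache[key] <= LIMIT


def run(gateways, messages):
    """Round-robin `messages` requests across `gateways`; return how many pass."""
    round_robin = cycle(gateways)
    return sum(next(round_robin).handle() for _ in range(messages))


# Each instance counts in its own local cache: 40 messages each, so nothing fails.
local = run([Gateway(Counter()), Gateway(Counter())], 80)

# Both instances count in one shared cache: only the first 50 messages pass.
shared_cache = Counter()
shared = run([Gateway(shared_cache), Gateway(shared_cache)], 80)

print(local, shared)
```

In this model, 80 messages arriving in one window all pass when each instance counts locally, whereas the shared count stops the burst at 50.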

The Throttling filter uses the pre-configured Local maximum messages cache by default. To configure a different cache, click the button on the right, and select from the list of currently configured caches in the tree. To add a cache, right-click the Caches tree node, and select Add Local Cache or Add Distributed Cache. Alternatively, you can configure caches under the Libraries node in the Policy Studio tree. For more details, see the topic on Global caches.

Advanced settings

Configure the following advanced settings:

A day starts on the following hour:

You must configure this field if you select either Day or Week from the drop-down list when configuring the Messages every field. For example, if you select Day, and enter 00:00 in this field, only the specified number of messages can be received in a one-day period that starts at midnight and lasts until midnight the next day.

A week starts on the following day:

You must configure this field only if you select Week from the drop-down list when configuring the Messages every field. For example, if you select Week, enter 00:00 in the A day starts on the following hour field, and enter Sunday in this field, the time period starts next Sunday at midnight, and lasts for exactly one week. The time period is reset at midnight on the following Sunday.
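One way to reason about these two fields is to compute the start of the day or week period that contains a given moment. The sketch below makes several assumptions: the function names are hypothetical, the day start is a whole hour, time zones are ignored, and weekdays follow the Python convention (Monday is 0, Sunday is 6). It is not taken from the API Gateway itself.

```python
from datetime import datetime, timedelta


def day_window_start(now, start_hour):
    """Start of the day period containing `now`, given the hour a day starts on."""
    start = now.replace(hour=start_hour, minute=0, second=0, microsecond=0)
    if start > now:
        # Today's period has not started yet, so `now` belongs to yesterday's.
        start -= timedelta(days=1)
    return start


def week_window_start(now, start_hour, start_weekday):
    """Start of the week period containing `now` (Monday is 0, Sunday is 6)."""
    start = day_window_start(now, start_hour)
    days_back = (start.weekday() - start_weekday) % 7
    return start - timedelta(days=days_back)


now = datetime(2024, 5, 15, 9, 30)   # a Wednesday
print(day_window_start(now, 0))      # 2024-05-15 00:00:00
print(week_window_start(now, 0, 6))  # 2024-05-12 00:00:00, the preceding Sunday
```

Under these assumptions, a message received on Wednesday morning is counted in the week period that opened at midnight on the preceding Sunday.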

Include remaining limit in HTTP response headers:

Specifies whether to include the following X-Rate-Limit headers in the HTTP response message:

  • X-Rate-Limit-Limit: Rate limit ceiling for the given request (for example, 100 messages).

  • X-Rate-Limit-Remaining: Number of requests left for the time window (for example, 45 messages).

  • X-Rate-Limit-Reset: Remaining time window before the rate limit resets in UTC epoch seconds (for example, 1353517297).

Begin rate limit HTTP Headers with:

Specifies the string used to begin the name of rate limit HTTP headers. Defaults to X-Rate-Limit-. For example, if you specify a value of My-Corp-Quota-, the following HTTP headers are inserted into the response message:

  • My-Corp-Quota-Limit

  • My-Corp-Quota-Remaining

  • My-Corp-Quota-Reset
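The effect of the prefix on the header names can be sketched as follows. The function rate_limit_headers and its parameters are hypothetical, and the values are illustrative only. The sketch shows how the three names are formed, not how the API Gateway computes the values.

```python
def rate_limit_headers(limit, used, reset_epoch, prefix="X-Rate-Limit-"):
    """Build the three rate limit headers, with names that begin with `prefix`."""
    remaining = max(limit - used, 0)  # never report a negative remainder
    return {
        prefix + "Limit": str(limit),
        prefix + "Remaining": str(remaining),
        prefix + "Reset": str(reset_epoch),
    }


headers = rate_limit_headers(
    limit=100, used=55, reset_epoch=1353517297, prefix="My-Corp-Quota-"
)
for name, value in headers.items():
    print(f"{name}: {value}")
```

With the prefix My-Corp-Quota-, the sketch produces the My-Corp-Quota-Limit, My-Corp-Quota-Remaining, and My-Corp-Quota-Reset headers listed above.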

Use multiple throttling filters

To use two or more Throttling filters to maintain separate message counts, you must use a different Rate limit based on value for each filter, or use a different cache for each filter:

  • Use a different Rate limit based on value per filter:

    With this approach, you can use a unique Rate limit based on value in each Throttling filter. The easiest way to do this is to prepend the ${http.request.clientaddr.getAddress()} selector value with the filter name, for example:

    My Corp Quota Filter ${http.request.clientaddr.getAddress()}

    This ensures that each filter maintains its own separate message count in the selected cache.

  • Use a unique cache per filter:

    Alternatively, you can use a unique cache to store the message count of each Throttling filter. With this solution, you must configure a separate cache for each Throttling filter that you have configured throughout all policies running on the API Gateway.
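The reason for both approaches can be seen in a small sketch. It assumes that filters sharing a cache look up their counts by key alone; the count_in_cache function and the second filter name, Partner Quota Filter, are hypothetical.

```python
from collections import Counter


def count_in_cache(cache, filter_prefix, client_ip):
    """Count one message under a key built from a filter prefix and the client IP."""
    key = f"{filter_prefix}{client_ip}"
    cache[key] += 1
    return cache[key]


# Two filters, same cache, same key: the counts are merged.
one_cache = Counter()
count_in_cache(one_cache, "", "10.0.0.1")           # first filter
merged = count_in_cache(one_cache, "", "10.0.0.1")  # second filter sees a count of 2

# Two filters, same cache, keys prefixed with the filter name: the counts stay separate.
prefixed_cache = Counter()
first = count_in_cache(prefixed_cache, "My Corp Quota Filter ", "10.0.0.1")
second = count_in_cache(prefixed_cache, "Partner Quota Filter ", "10.0.0.1")

print(merged, first, second)
```

Here the unprefixed filters end up sharing a single count for the same client, while the prefixed filters each keep their own. Using a separate cache per filter has the same effect without changing the key.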