Gathering Metrics
Oracle Health Insurance applications can gather the following metrics:
- JVM memory consumption and processor metrics
- Application-specific metrics, such as timers and counters for Dynamic Logic execution or Web Service request handling
Metric data is exposed in Prometheus format. Prometheus can scrape metrics data using the "/prometheus" endpoint that Oracle Health Insurance applications expose.
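As a quick manual check outside of Prometheus, the endpoint can also be read with a small script. The following is a minimal sketch, not part of the product: the function names and the base URL in the comment are hypothetical, and it assumes the endpoint is reachable without authentication. An HTTP 200 response with an empty body is what the endpoint returns while metrics gathering is disabled.

```python
from urllib.request import urlopen


def fetch_metrics(base_url, timeout=5.0):
    """Return (HTTP status, body text) for the application's /prometheus endpoint."""
    url = base_url.rstrip("/") + "/prometheus"
    with urlopen(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


def has_metric_data(status, body):
    """True when the response is HTTP 200 and the body is not empty."""
    return status == 200 and body.strip() != ""


# Hypothetical usage; replace the URL with the base URL of your application:
# status, body = fetch_metrics("http://localhost:8080/my-ohi-app")
# print(has_metric_data(status, body))
```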
Metric Data
The following table lists the metrics and the Oracle Health Insurance applications that record them:
Metric Name | Type | Description | Tags | Oracle Health Insurance Applications |
---|---|---|---|---|
ohi.dylo.timer | timer | Times Dynamic Logic executions | code: dynamic logic code | All |
ohi.resource.client.timer | timer | Times requests to external REST resources | resource: resource name | All |
ohi.resource.timer | timer | Times handling of requests for the Oracle Health Insurance application's HTTP API resources | resource: resource name | All |
ohi.task.dequeue | counter | Counts the number of tasks that were dequeued from the task queue | Not applicable | All |
ohi.task.enqueue | counter | Counts the number of tasks that were enqueued to the task queue | Not applicable | All |
ohi.task.timer | timer | Times task execution | code: task type code | All |
ohi.exchange.timer | timer | Times exchange execution | integration: integration code | Oracle Insurance Gateway |
Enable Metric Data Recording
By default, metrics gathering is disabled. In that case the
"/prometheus" endpoint returns an HTTP 200 response without content.
To enable metrics gathering, set the following system properties:
- ohi.instrumentation.gather.jvmtelemetry: set to true to enable recording of JVM telemetry. The application must be restarted for the change to take effect.
- ohi.instrumentation.gather.applicationmetrics: set to true to enable recording of application metrics. The change takes effect immediately; no restart is required.
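How system properties are supplied depends on the deployment. As one illustration only, the sketch below renders the two properties as standard JVM "-D" startup options. This is an assumption about the launch mechanism, not a documented procedure, and the helper name is hypothetical. It also does not cover changing a property at runtime, which the second property allows.

```python
def jvm_system_property_args(properties):
    """Render name/value pairs as standard JVM -Dname=value options."""
    return [f"-D{name}={value}" for name, value in properties.items()]


# Both gathering switches turned on (assumed to be passed at JVM startup).
args = jvm_system_property_args({
    "ohi.instrumentation.gather.jvmtelemetry": "true",
    "ohi.instrumentation.gather.applicationmetrics": "true",
})
print(" ".join(args))
```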
Record non-Oracle Health Insurance Metrics
Oracle Health Insurance applications record only metrics whose names start with the prefix "ohi.". Metrics published by non-Oracle Health Insurance components that apply different naming conventions are not recorded. To record those metrics as well, set the system property ohi.instrumentation.filter.ohi.nameprefix to false.
Add Application Tag for All Oracle Health Insurance Metrics
An application-specific identifier for metrics is important when similar metrics are collected into one Prometheus instance from various Oracle Health Insurance applications. It allows filtering metric data on a per application basis. There are multiple ways to add a source application tag or label to any metric:
- Configure Prometheus to add such a common tag or label to all metrics it collects.
- Alternatively, configure an Oracle Health Insurance application to add an application identifier as a tag to all metrics by setting system property ohi.instrumentation.common.application.tag to true.
Configuring Timers
Timers are the most memory-consuming type of meter, and their total footprint can vary significantly depending on the selected options. This section lists configuration options for timers.
Histograms and Percentiles
As a rule of thumb, assume that a percentile histogram for a timer requires about 8 kB of memory. Note that this footprint applies to every combination of meter name and tags.
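To make the rule of thumb concrete, the following sketch estimates an upper bound for the histogram footprint of a single timer. The tag cardinalities are made-up numbers, the helper name is hypothetical, and 8 kB is taken as 8 * 1024 bytes. The real number of combinations is usually lower than the product, because not every tag combination occurs.

```python
import math

HISTOGRAM_BYTES = 8 * 1024  # rule-of-thumb size of one percentile histogram


def histogram_footprint_bytes(tag_value_counts):
    """Upper bound: one histogram per possible combination of tag values."""
    combinations = math.prod(tag_value_counts.values())
    return combinations * HISTOGRAM_BYTES


# Hypothetical cardinalities for a timer tagged with resource, method and status.
estimate = histogram_footprint_bytes({"resource": 150, "method": 4, "status": 6})
print(f"{estimate / (1024 * 1024):.1f} MiB")
```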
By default, the timers configured in Oracle Health Insurance applications do not collect
percentile distributions or histogram data.
The two approaches for recording percentiles are:
- Percentile histograms: when this option is enabled, the application accumulates values in an underlying histogram and publishes a predetermined set of buckets to Prometheus. Calculate percentiles from this histogram using the Prometheus query language.
  To enable recording of percentile histograms, set system property ohi.instrumentation.<timer>.histogram, where the placeholder <timer> is the name of the timer, to true.
- Percentile distributions: when this option is enabled, the application computes a percentile approximation for each meter ID, based on the set of name and tags, and publishes the percentile values to Prometheus. Note that this is not as flexible as using a percentile histogram because it is not possible to aggregate percentile approximations across tags.
  To enable recording of percentile distributions, specify the percentiles as the value for system property ohi.instrumentation.<timer>.percentiles, where the placeholder <timer> is the name of the timer. Percentiles must be specified as a comma-separated string. For example, set the median, 0.75 and 0.95 percentiles for the "ohi.resource.timer" as follows:
  ohi.instrumentation.ohi.resource.timer.percentiles=0.5,0.75,0.95
Example: verbatim "/prometheus" endpoint output for the "ohi.resource.timer" (for specific resource "/currentproperties") without percentile distributions or histogram data:
# HELP ohi_resource_timer_seconds_max Resource timer
# TYPE ohi_resource_timer_seconds_max gauge
ohi_resource_timer_seconds_max{method="GET",resource="/currentproperties",status="200",} 0.273524648
# HELP ohi_resource_timer_seconds Resource timer
# TYPE ohi_resource_timer_seconds summary
ohi_resource_timer_seconds_count{method="GET",resource="/currentproperties",status="200",} 5.0
ohi_resource_timer_seconds_sum{method="GET",resource="/currentproperties",status="200",} 0.399261975
Verbatim "/prometheus" endpoint output for the same "ohi.resource.timer" with histogram data enabled:
# HELP ohi_resource_timer_seconds_max Resource timer
# TYPE ohi_resource_timer_seconds_max gauge
ohi_resource_timer_seconds_max{method="GET",resource="/currentproperties",status="200",} 0.256794629
# HELP ohi_resource_timer_seconds Resource timer
# TYPE ohi_resource_timer_seconds histogram
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.001",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.001048576",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.001398101",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.001747626",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.002097151",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.002446676",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.002796201",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.003145726",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.003495251",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.003844776",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.004194304",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.005592405",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.006990506",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.008388607",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.009786708",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.011184809",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.01258291",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.013981011",} 0.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.015379112",} 1.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.016777216",} 2.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.022369621",} 3.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.027962026",} 3.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.033554431",} 3.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.039146836",} 3.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.044739241",} 3.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.050331646",} 3.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.055924051",} 3.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.061516456",} 3.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.067108864",} 3.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.089478485",} 4.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.111848106",} 4.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.134217727",} 4.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.156587348",} 4.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.178956969",} 4.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.20132659",} 4.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.223696211",} 4.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.246065832",} 4.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.268435456",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.357913941",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.447392426",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.536870911",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.626349396",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.715827881",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.805306366",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.894784851",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="0.984263336",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="1.073741824",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="1.431655765",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="1.789569706",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="2.147483647",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="2.505397588",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="2.863311529",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="3.22122547",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="3.579139411",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="3.937053352",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="4.294967296",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="5.726623061",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="7.158278826",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="8.589934591",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="10.021590356",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="11.453246121",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="12.884901886",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="14.316557651",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="15.748213416",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="17.179869184",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="22.906492245",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="28.633115306",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="30.0",} 5.0
ohi_resource_timer_seconds_bucket{method="GET",resource="/currentproperties",status="200",le="+Inf",} 5.0
ohi_resource_timer_seconds_count{method="GET",resource="/currentproperties",status="200",} 5.0
ohi_resource_timer_seconds_sum{method="GET",resource="/currentproperties",status="200",} 0.391383318
And finally, verbatim "/prometheus" endpoint output for the same "ohi.resource.timer" with percentile distributions enabled:
# HELP ohi_resource_timer_seconds_max Resource timer
# TYPE ohi_resource_timer_seconds_max gauge
ohi_resource_timer_seconds_max{method="GET",resource="/currentproperties",status="200",} 0.280607455
# HELP ohi_resource_timer_seconds Resource timer
# TYPE ohi_resource_timer_seconds summary
ohi_resource_timer_seconds{method="GET",resource="/currentproperties",status="200",quantile="0.5",} 0.027262976
ohi_resource_timer_seconds{method="GET",resource="/currentproperties",status="200",quantile="0.75",} 0.087031808
ohi_resource_timer_seconds{method="GET",resource="/currentproperties",status="200",quantile="0.95",} 0.284164096
ohi_resource_timer_seconds_count{method="GET",resource="/currentproperties",status="200",} 5.0
ohi_resource_timer_seconds_sum{method="GET",resource="/currentproperties",status="200",} 0.438916162
Note that the values for the 'max' gauge and percentiles are bound to a time window of 2 minutes. For the 'max' gauge for example, it means that its value is the maximum value during the time window. If no new values are recorded for the time window length, the 'max' gauge will be reset to 0.0 as a new time window starts.
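In Prometheus itself, percentiles are derived from the published buckets with the histogram_quantile() query function. To show what that calculation does with cumulative bucket counts, here is a simplified sketch in Python that parses a few bucket lines copied from the histogram output for "ohi.resource.timer" and interpolates linearly inside the matching bucket. The function names are hypothetical, only a subset of the buckets is included, and the parser assumes that all lines belong to a single combination of tags.

```python
import math
import re

# One bucket sample line: <metric>_bucket{...,le="<bound>",} <cumulative count>
BUCKET_LINE = re.compile(r'^(?P<name>\w+)_bucket\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LE_LABEL = re.compile(r'le="([^"]+)"')


def parse_buckets(text, metric):
    """Return (upper_bound, cumulative_count) pairs sorted by upper bound."""
    buckets = []
    for line in text.splitlines():
        match = BUCKET_LINE.match(line.strip())
        if not match or match.group("name") != metric:
            continue
        le = LE_LABEL.search(match.group("labels"))
        if le is not None:
            buckets.append((float(le.group(1)), float(match.group("value"))))
    return sorted(buckets)


def histogram_quantile(q, buckets):
    """Estimate the q-quantile by linear interpolation within one bucket."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets or buckets[-1][1] == 0:
        return math.nan
    rank = q * buckets[-1][1]  # position of the quantile in the observations
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # fall back to the highest finite bound
            if count == lower_count:
                return upper_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan


LABELS = 'method="GET",resource="/currentproperties",status="200"'
sample = "\n".join(
    f'ohi_resource_timer_seconds_bucket{{{LABELS},le="{le}",}} {count}'
    for le, count in [
        ("0.015379112", "1.0"), ("0.016777216", "2.0"), ("0.022369621", "3.0"),
        ("0.089478485", "4.0"), ("0.268435456", "5.0"), ("+Inf", "5.0"),
    ]
)

buckets = parse_buckets(sample, "ohi_resource_timer_seconds")
median = histogram_quantile(0.5, buckets)
print(f"estimated median: {median:.4f} s")  # about 0.0196 s
```

The estimate differs from the 0.5 quantile in the percentile distribution output because the two examples come from different sets of five requests and because bucket interpolation is itself an approximation.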
Determine Timers for which Metrics are Recorded and Published
For more fine-grained control, combine the following properties to configure for which timers the application collects and publishes metric data:
- ohi.instrumentation.<timer>.regex.tagname, where placeholder <timer> is the name of the timer.
- ohi.instrumentation.<timer>.regex, where placeholder <timer> is the name of the timer.
For the given timer, the application checks whether the value of the tag named by the regex.tagname property matches the regular expression. If it does, metrics data for the timer is published.
For example, to collect metrics data for timer "ohi.resource.timer" for specific HTTP API resources (also known as IP resources), configure the following properties:
ohi.instrumentation.ohi.resource.timer.regex.tagname=resource
ohi.instrumentation.ohi.resource.timer.regex=^(?!\\/generic\\/).+
Generic HTTP API resources are identified by resource paths starting with "/generic/" (note that "generic" is the resource that provides an overview of the generic resources). The regular expression will not match these. As a result, the application records and publishes metrics data for IP resources only.
Alternatively, to collect metrics data for timer "ohi.resource.timer" for generic resources only, thus ignoring timer data for IP resources, configure the following properties:
ohi.instrumentation.ohi.resource.timer.regex.tagname=resource
ohi.instrumentation.ohi.resource.timer.regex=^\\/generic\\/.+
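The effect of the two regular expressions can be tried out on sample tag values. The sketch below assumes that the doubled backslashes in the property values are properties-file escaping, so that the regex engine sees a single backslash, and that the expression is matched from the start of the tag value. The helper name and the "/generic/claims" path are made up for illustration.

```python
import re

# Patterns as the regex engine would see them after unescaping.
IP_ONLY = re.compile(r"^(?!\/generic\/).+")   # everything except /generic/...
GENERIC_ONLY = re.compile(r"^\/generic\/.+")  # only /generic/...


def is_published(tag_value, pattern):
    """True when the tag value matches, so the timer data would be published."""
    return pattern.match(tag_value) is not None


for resource in ["/currentproperties", "/generic/claims", "/generic"]:
    print(resource, is_published(resource, IP_ONLY), is_published(resource, GENERIC_ONLY))
```

Note that, under these assumptions, the bare "/generic" overview resource is treated like any non-generic resource by both expressions, because neither pattern considers a path without the trailing "/generic/" prefix to be generic.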
Configure ohi.resource.client.timer Resource Tag Construction
The resource tag for an "ohi.resource.client.timer" should point to the name of the resource and not be more specific than that. For example, for a "/persons" resource with the following URL
/persons/1234
the value for the resource tag should be "/persons" and not the specific person resource "/persons/1234".
Using overly specific values like "/persons/1234" would lead to an explosion of "ohi.resource.client.timer" metrics being recorded and published. For that reason, Oracle Health Insurance applications stop after the first path segment when determining the resource tag for an "ohi.resource.client.timer".
If that first path segment is, for example, a (load balancer) context root that should be included but does not by itself identify the actual resource name, configure the system to continue traversing the resource path until it encounters a path segment that is not in the list of known path segment prefixes.
Use the system property ohi.instrumentation.resourceclienttimer.segment.prefixes to specify a comma-separated list of known segment prefixes.
For example, for an external "/persons" resource with the following URL
/loadbalancer-url/api/persons/1234
and an external "/providers" resource with the following URL
/provider-system-api/providers/ABCD
configure the property as follows:
ohi.instrumentation.resourceclienttimer.segment.prefixes=loadbalancer-url,api,provider-system-api
in order for Oracle Health Insurance applications to determine resource tag values "/loadbalancer-url/api/persons" and "/provider-system-api/providers" respectively.
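The traversal described in this section can be sketched as follows. This is an illustration of the documented behavior, not the application's actual code: the function name is hypothetical, and it assumes that a configured prefix must equal a whole path segment.

```python
from typing import Iterable


def resource_tag(path: str, known_prefixes: Iterable[str] = ()) -> str:
    """Keep path segments up to and including the first one that is not a known prefix."""
    prefixes = set(known_prefixes)
    kept = []
    for segment in (s for s in path.split("/") if s):
        kept.append(segment)
        if segment not in prefixes:
            break  # this segment is taken to be the resource name
    return "/" + "/".join(kept)


# Value of ohi.instrumentation.resourceclienttimer.segment.prefixes from the example.
configured = "loadbalancer-url,api,provider-system-api".split(",")

print(resource_tag("/persons/1234"))
print(resource_tag("/loadbalancer-url/api/persons/1234", configured))
print(resource_tag("/provider-system-api/providers/ABCD", configured))
```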