6 Monitoring Exadata
6.1 Introducing Exadata Monitoring Tools and Information Sources
You can monitor Exadata system performance by using various tools and information sources. The following topics introduce and compare the main Exadata performance monitoring tools and information sources:
- Automatic Workload Repository (AWR)

  AWR is the integrated performance diagnostic tool built into Oracle Database. AWR includes Exadata storage server configuration, health, and statistics information.

- Database Statistics and Wait Events

  Oracle Database provides performance information in a series of dynamic performance views (also known as V$ views).

- Exadata Metrics

  Exadata metrics are recorded observations of important properties or values relating to the Exadata system software.

- Exadata Alerts

  Alerts draw attention to potential and actual problems and other events of interest to an administrator.

- Real-Time Insight

  You can use the real-time insight feature to enable real-time monitoring of your Exadata systems.

- ExaWatcher

  ExaWatcher collects very detailed system statistics, primarily focused on the operating system and network interfaces.
6.1.1 Automatic Workload Repository (AWR)
AWR is the integrated performance diagnostic tool built into Oracle Database. AWR includes Exadata storage server configuration, health, and statistics information.
AWR collects information from various database dynamic performance views at regular intervals. Each data collection is an AWR snapshot. An AWR report shows statistics captured between two AWR snapshots.
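For example, you can take a manual AWR snapshot, or adjust the snapshot interval, by using the DBMS_WORKLOAD_REPOSITORY package. The following is a minimal sketch, assuming a suitably privileged user; the 30-minute interval shown is purely illustrative, not a recommendation.

SQL> EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();

SQL> SELECT snap_interval, retention
FROM dba_hist_wr_control;

SQL> EXEC DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(interval => 30);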
AWR collects and reports a wealth of statistical data related to Oracle Database, including database statistics and wait events from V$SYSSTAT, V$SYSTEM_EVENT, V$SEGMENT_STATISTICS, and V$SQL.
Commencing with Oracle Database release 12.1.0.2, AWR also collects performance statistics from Exadata storage servers and includes them in the HTML version of the AWR report. Consequently, AWR provides comprehensive and unified performance information that includes Oracle Database statistics and Exadata statistics, storage metrics, and alerts from all storage servers.
Note:
The Exadata sections are only available in the HTML version of the AWR report.
Key benefits include:
- AWR provides comprehensive and unified performance information that includes Oracle Database metrics and Exadata storage metrics and alerts from all storage servers.
- AWR enables simple outlier detection and analysis.
- The granularity of AWR information is customizable and based on the AWR collection interval, which is hourly by default.
It is important to note that the Exadata storage server statistics are collected and maintained by the storage servers, so the information is not restricted to a specific database or database instance. Consequently, the storage-related statistics in AWR include the I/O for all databases running on the storage servers, while the database statistics relate to the specific instance or database hosting AWR.
When using a container database (CDB), the Exadata-specific sections of the AWR report are only available in the root container.
As part of Oracle Database, AWR is constantly being enhanced with new statistics and analysis. The availability of a specific section or statistic is subject to the version of Oracle Database being used.

The AWR report contains the following Exadata-specific sections:
Exadata Server Configuration
The Exadata Server Configuration section of the AWR report shows configuration information from all of the associated storage servers, including model information, software versions, and storage information relating to cell disks, grid disks, ASM disk groups, and the IORM objective. By examining this information, you can easily see any configuration differences across the storage servers that could potentially affect the behavior or performance of the system.
The following example shows part of the Exadata Server Configuration section. The example is based on a system with 14 X8-2 High Capacity storage servers. In the example, all of the servers have the same software version. However, the storage information shows that one cell contains 2 flash cell disks, while the others all have 4 flash cell disks.
Figure 6-1 AWR Report: Exadata Server Configuration

Description of "Figure 6-1 AWR Report: Exadata Server Configuration"
Exadata Server Health Report
The Exadata Server Health Report section of the AWR report shows information about offline disks and open storage server alerts.
When a disk is offline, the total I/O bandwidth is reduced, which may affect I/O performance.
Alerts represent events of importance on a storage server that could affect its functionality and performance. Open alerts are alerts that have not been marked as examined by an administrator. For more details, see Exadata Alerts.
Ideally, there should be no open alerts in the AWR report.
The following example shows Exadata alert details indicating that the offload server was unable to start, which would affect smart scan performance on the system.
Figure 6-3 AWR Report: Exadata Alert Details

Description of "Figure 6-3 AWR Report: Exadata Alert Details"
Exadata Statistics
AWR includes statistics from various Exadata storage server components, such as Smart I/O, Smart Flash Cache, Smart Flash Log, XRMEM cache, XRMEM log, and IO Resource Manager. It includes I/O statistics from both the operating system and the cell server software, and performs simple outlier analysis on these I/O statistics. Because Exadata typically distributes I/O operations evenly across all cells and all disks, outliers may indicate a potential problem and warrant further investigation. If any cell or disk is behaving differently or performing more work, this could potentially cause slower performance for the whole system.
The Exadata Statistics section of the AWR report includes the following sub-sections:
- Performance Summary
- Exadata Resource Statistics
  - Exadata Outlier Summary
  - Exadata OS Statistics Outliers
  - Exadata Cell Server Statistics Outliers
  - Exadata Outlier Details
  - Exadata OS Statistics Top
  - Exadata Cell Server Statistics Top
- Exadata Smart Statistics
  - Smart IO
  - Flash Log
  - XRMEM Log
  - Flash Cache
  - XRMEM Cache
- Exadata IO Reasons
  - Top IO Reasons by Requests
  - Top IO Reasons by MB
  - Internal IO Reasons
- Exadata Top Database Consumers
  - Top Databases by Requests
  - Top Databases by Requests - Details
  - Top Databases by Throughput
  - Top Databases by Requests per Cell
  - Top Databases by Requests per Cell - Details
  - Top Databases by Throughput per Cell
- Exadata IO Latency Capping
  - Cancelled IOs - Client
  - Cancelled IOs - Cells
- Exadata Flash Wear
Key Exadata statistics are discussed in detail in the corresponding topics under Monitoring Oracle Exadata System Software Components.
6.1.2 Database Statistics and Wait Events
Oracle Database provides performance information in a series of dynamic performance views (also known as V$ views).
The dynamic performance views contain a wealth of database statistics and wait events. By using dynamic performance views, a database administrator can perform detailed and customized monitoring on individual SQL operations and database sessions. The dynamic performance views are also a major information source for Automatic Workload Repository (AWR) and other Oracle Database monitoring tools, such as SQL monitor.
The following sections introduce the main dynamic performance views that include Exadata-specific information.
- V$SYSSTAT
- V$SESSION
- V$SESSION_EVENT
- V$SYSTEM_EVENT
- V$ACTIVE_SESSION_HISTORY
- V$SQL
- V$SEGMENT_STATISTICS
V$SYSSTAT
V$SYSSTAT contains various system-wide statistics relating to the current database instance, including I/O statistics related to Exadata storage server.
For example, you can use the following query to display I/O statistics related to Exadata storage server:
SQL> SELECT name, value
FROM v$sysstat
WHERE name like '%cell%' or name like 'physical%'
ORDER BY name;
These statistics are also in the Global Activity Statistics or Instance Activity Statistics section of the AWR report.
V$SESSION
V$SESSION contains various statistics and wait events for currently active database sessions.
For example, you can use the following query to display a summary that shows the number of sessions that are currently waiting on cell events:
SQL> SELECT event, count(*)
FROM v$session
WHERE event like 'cell%'
AND wait_time = 0 -- session currently waiting
GROUP BY event
ORDER BY count(*) desc;
The following query displays event details for sessions that are currently waiting on cell events:
SQL> SELECT event, p1, p2, p3
FROM v$session
WHERE event like 'cell%'
AND wait_time = 0;
For cell wait events in V$SESSION:

- The P1 column identifies the cell hash number. You can use the V$CELL view to identify the cell from the cell hash number.
- The P2 column identifies the disk hash number. You can use the V$ASM_DISK view to identify the disk name from the disk hash number.
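For example, the following query (a sketch, assuming access to V$SESSION and V$CELL) translates the cell hash number into the cell name for sessions that are currently waiting on cell events:

SQL> SELECT s.sid, s.event, c.cell_path, s.p2 disk_hash
FROM v$session s, v$cell c
WHERE s.event like 'cell%'
AND s.wait_time = 0 -- session currently waiting
AND s.p1 = c.cell_hashval;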
V$SESSION_EVENT
V$SESSION_EVENT displays information on waits for an event by a session.
For example, you can use the following query to display a summary of the wait events for the current session. By examining the output from this query, you can compare the time spent waiting for Exadata storage server events with the time spent waiting for other events in the database.
SQL> SELECT event, total_waits, time_waited_micro, time_waited_micro/decode(total_waits, 0, null, total_waits) avg_wait_time
FROM v$session_event
WHERE wait_class != 'Idle'
AND sid = userenv('SID')
ORDER BY time_waited_micro desc;
V$SYSTEM_EVENT
V$SYSTEM_EVENT displays system-wide information on total waits for an event.
You can use the following query to display a summary of the wait events since start-up of the current database instance:
SQL> SELECT event, total_waits, time_waited_micro, time_waited_micro/decode(total_waits, 0, null, total_waits) avg_wait_time
FROM v$system_event
WHERE wait_class != 'Idle'
ORDER by time_waited_micro desc;
Information from V$SYSTEM_EVENT is contained in the Wait Events Statistics section of the AWR report.
V$ACTIVE_SESSION_HISTORY
V$ACTIVE_SESSION_HISTORY keeps a history of the active sessions on the system, where active is defined as using CPU or waiting on a non-idle wait event. For each session, V$ACTIVE_SESSION_HISTORY includes a wealth of information on what the session was doing, such as SQL execution plans and wait event details, including information about Exadata storage interactions.
For example, you can use the following query to display cell block reads over the past 5 minutes with latency in excess of 10ms:
SQL> SELECT sample_time, p1, p2, time_waited
FROM v$active_session_history
WHERE event = 'cell single block physical read'
AND wait_time = 0 -- currently waiting
AND sample_time > sysdate - 5/1440 -- past 5 minutes
AND time_waited > 10000
ORDER by time_waited desc;
The Active Session History (ASH) Report section of the AWR report is based on V$ACTIVE_SESSION_HISTORY. Oracle Database also provides a more detailed Active Session History (ASH) report.
V$SQL
V$SQL, and related views like V$SQLAREA, V$SQLAREA_PLAN_HASH, V$SQLSTATS, and V$SQLSTATS_PLAN_HASH, contain information and statistics for SQL statements processed on the database instance.

V$SQL, and related views, include the following columns that contain information about Exadata storage interactions:
- PHYSICAL_READ_BYTES — number of bytes read by the SQL
- PHYSICAL_READ_REQUESTS — number of read requests issued by the SQL
- PHYSICAL_WRITE_BYTES — number of bytes written by the SQL
- PHYSICAL_WRITE_REQUESTS — number of write requests issued by the SQL
- IO_CELL_OFFLOAD_ELIGIBLE_BYTES — number of bytes eligible for predicate offload to Exadata storage server
- IO_CELL_OFFLOAD_RETURNED_BYTES — number of bytes returned by smart scans
- IO_INTERCONNECT_BYTES — number of bytes exchanged between the database and the storage servers
- IO_CELL_UNCOMPRESSED_BYTES — number of uncompressed bytes read, where uncompressed bytes is the size after decompression
- OPTIMIZED_PHY_READ_REQUESTS — number of read requests satisfied from Exadata Smart Flash Cache, or read requests avoided due to storage index or columnar cache
The following example shows a query that displays Exadata offload processing performance data. The output focuses on a query that scans the SALES table. The output shows that all of the data in the SALES table (approximately 5 GB) was eligible for offload processing. In this case, because of offload processing, even though the query performed 5385.75 MB of I/O, only 417.65 MB of data was delivered over the network to the database host.
SQL> SELECT sql_text,
io_cell_offload_eligible_bytes/1024/1024 offload_eligible_mb,
io_cell_uncompressed_bytes/1024/1024 io_uncompressed_mb,
io_interconnect_bytes/1024/1024 io_interconnect_mb,
io_cell_offload_returned_bytes/1024/1024 cell_return_bytes_mb,
(physical_read_bytes + physical_write_bytes)/1024/1024 io_disk_mb
FROM v$sql
WHERE UPPER(sql_text) LIKE '%FROM SALES%';
SQL_TEXT OFFLOAD_ELIGIBLE_MB IO_UNCOMPRESSED_MB IO_INTERCONNECT_MB CELL_RETURN_BYTES_MB IO_DISK_MB
---------------------------- ------------------- ------------------ ------------------ -------------------- ----------
select count(*) from sales 5283.06 5283.06 520.34 417.65 5385.75
Information from V$SQL is displayed in the SQL Statistics section of the AWR report.
V$SEGMENT_STATISTICS
V$SEGMENT_STATISTICS contains statistics on a per-segment basis. The segment-level statistics can be used to detect specific objects, such as tables or indexes, that are performing optimized reads from Exadata storage.

The optimized physical reads segment statistic records the number of read requests for objects that are read from Exadata Smart Flash Cache or reads that are avoided through the use of storage index or columnar cache. The optimized physical writes statistic records the number of write requests for an object that first went to Exadata Smart Flash Cache. Such write requests can be synchronized to the disk later in a lazy manner to free up cache space.
The following example query displays objects that are associated with more than 1000 optimized reads from Exadata storage. A similar query can be used to determine the objects that are associated with very few optimized reads.
SQL> SELECT object_name, value
FROM v$segment_statistics
WHERE statistic_name='optimized physical reads'
AND value>1000
ORDER BY value;
Information from V$SEGMENT_STATISTICS is displayed in the Segment Statistics section of the AWR report.
6.1.3 Exadata Metrics
Exadata metrics are recorded observations of important properties or values relating to the Exadata system software.
Exadata metrics contain detailed statistics for most Exadata components. Most metrics relate to the storage server and its components, such as flash cache, cell disks, and grid disks. Storage server metrics enable detailed monitoring of Exadata storage server performance.
Metrics are of the following types:
- Cumulative metrics are statistics that accumulate over time since the metric was created or the server was restarted.

  For example, CL_IO_RQ_NODATA is the total number of I/O requests that didn't return data.

- Instantaneous metrics contain the current value at the time of the metric observation.

  For example, CL_MEMUT is the percentage of total physical memory currently used on the server.

- Rate metrics are computed statistics where the value is observed over time.

  For example, N_NIC_KB_TRANS_SEC is the number of kilobytes per second transmitted over the server's Ethernet interfaces.
Some metrics differentiate between small I/O and large I/O. For such metrics, small I/O means I/O that is less than or equal to 128 KB in size. Large I/O is greater than 128 KB in size.
By default, metric collections occur at 1-minute intervals. Metric observations in the default collection are initially recorded in memory and later written to a disk-based repository. For most metrics, seven days of metric observations are maintained by default. However, a subset of key metrics are retained for up to one year. In all cases, historical metric observations are purged automatically if the server detects a storage space shortage.
Commencing with Oracle Exadata System Software 22.1.0, you can optionally configure fine-grained metrics. To enable fine-grained metrics, you must specify a collection interval between 1 and 60 seconds. You can also choose the metrics to include in the fine-grained collection. Fine-grained metric collection is the foundation for real-time metric streaming. Consequently, fine-grained metrics are only recorded in memory to support efficient streaming to an external metric collection and visualization platform.
You can use the CellCLI LIST command as the primary means to display Exadata metric observations. See Displaying Metrics. However, because Exadata metrics are managed within each Exadata server, metric observations must be collected and correlated by some additional means to gain a system-wide view.
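For example, the following CellCLI sketch displays an instantaneous metric and part of the history of a cumulative metric on one storage server. The timestamp value is purely illustrative, and the commands must still be run on every cell (for example, through a remote execution utility) to obtain a system-wide picture.

CellCLI> LIST METRICCURRENT CL_MEMUT DETAIL

CellCLI> LIST METRICHISTORY CL_IO_RQ_NODATA WHERE collectionTime > '2024-06-01T00:00:00-07:00'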
6.1.4 Exadata Alerts
Alerts draw attention to potential and actual problems and other events of interest to an administrator.
There are three types of alerts: informational alerts, warning alerts, and critical alerts.

Metric observations are automatically compared with stored thresholds. When a metric crosses a specified threshold, an alert is automatically generated. Metrics can be associated with warning and critical thresholds. When a metric value crosses a warning threshold, a warning alert is generated. When a metric crosses a critical threshold, a critical alert is generated.
Exadata administrators can specify the warning and critical threshold for most metrics. There are also some preset thresholds. For example, there are built-in thresholds for ambient temperature. If the temperature is too low or too high, an alert is automatically generated.
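For example, the following CellCLI sketch creates a threshold for a hypothetical database named MYDB on the DB_IO_RQ_SM_SEC metric, and then lists the configured thresholds. The metric choice and all threshold values are illustrative assumptions, not recommendations.

CellCLI> CREATE THRESHOLD db_io_rq_sm_sec.MYDB comparison='>', warning=100, critical=200, occurrences=2, observation=5

CellCLI> LIST THRESHOLD DETAIL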
Some system conditions and state changes can generate informational alerts; for example, a server chassis is opened. More serious system errors and state changes can also generate warning or critical alerts; for example, a hard disk device is reporting disk errors.
This system of thresholds and alerts enables administrators to focus on the most important events, while maintaining awareness of other interesting events.
You can use CellCLI commands to manage and display Exadata alerts and thresholds. You can also configure alert notification using email and SNMP. Alert information is also included in the AWR report.
Like Exadata metrics, alerts are managed separately on each server.
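For example, the following CellCLI sketch lists open critical alerts on one storage server and then marks one of them as examined; the alert name (1_1) and administrator name are hypothetical.

CellCLI> LIST ALERTHISTORY WHERE severity = 'critical' AND examinedBy = '' DETAIL

CellCLI> ALTER ALERTHISTORY 1_1 examinedBy="jsmith"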
6.1.5 Real-Time Insight
You can use the real-time insight feature to enable real-time monitoring of your Exadata systems.
Commencing with Oracle Exadata System Software 22.1.0, real-time insight provides infrastructure to:
- Categorize specific metrics as fine-grained, and enable the collection of fine-grained metrics as often as every second.
- Stream metric observations to user-defined locations in real time, using either push (upload) or pull (download) transmission models.
6.1.6 ExaWatcher
ExaWatcher collects very detailed system statistics, primarily focused on the operating system and network interfaces.
ExaWatcher runs separately on each server and collects server and network-related data using operating system tools and utilities, such as iostat, vmstat, mpstat, and top. It also collects storage server statistics using cellsrvstat.
The information is very granular, with most operating system statistics collected at 5-second intervals. To extract statistics from ExaWatcher for analysis, you can use the GetExaWatcherResults.sh script. The output from GetExaWatcherResults.sh also includes charts for a small subset of key statistics.
Because ExaWatcher is typically used for detailed debugging, it is recommended to keep baselines of ExaWatcher data for comparison purposes.
6.2 Guidelines for Exadata Monitoring
Regardless of the information source, it is considered good practice to maintain baseline collections of the various performance data sources covering the important periods of your workload, such as peak periods, month-end or year-end processing, or before patching or upgrades.
Baseline information enables a comparison with a known 'normal' state when a performance regression occurs. Such comparison helps to identify changes and guide further analysis.
Baseline data should include AWR data (using AWR exports or AWR baselines), Exadata metrics, and ExaWatcher collections. For data warehousing workloads that rely on smart scans, keeping a collection of SQL monitor reports for key SQL statements is also highly recommended.
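For example, the following sketch creates an AWR baseline covering a known-good peak period; the snapshot IDs and baseline name are hypothetical and should be replaced with values that bracket the workload of interest.

SQL> BEGIN
  DBMS_WORKLOAD_REPOSITORY.CREATE_BASELINE(
    start_snap_id => 100,
    end_snap_id   => 112,
    baseline_name => 'monthend_peak_baseline');
END;
/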
6.3 Monitoring Oracle Exadata System Software Components
- Monitoring Exadata Smart Flash Cache
- Monitoring XRMEM Cache
- Monitoring PMEM Cache
- Monitoring Exadata Smart Flash Log
- Monitoring XRMEM Log
- Monitoring PMEM Log
- Monitoring Smart I/O
- Monitoring I/O Resource Management (IORM)
- Monitoring Cell Disk I/O
- Monitoring Grid Disks
- Monitoring Host Interconnect Metrics

  Host interconnect metrics provide information about the I/O transmission for hosts that access cell storage.
6.3.1 Monitoring Exadata Smart Flash Cache
Exadata Smart Flash Cache holds frequently accessed data in flash storage, while most data is kept in very cost-effective disk storage. Caching occurs automatically and requires no user or administrator effort. Exadata Smart Flash Cache can intelligently determine the data that is most useful to cache based on data usage, access patterns, and hints from the database that indicate the type of data being accessed.
Exadata Smart Flash Cache can operate in Write-Through mode or Write-Back mode. In Write-Through mode, database writes go to disk first and subsequently populate Flash Cache. If a flash device fails with Exadata Smart Flash Cache operating in Write-Through mode, there is no data loss because the data is already on disk.
In Write-Back mode, database writes go to Flash Cache first and later to disk. The contents of the Write-Back Flash Cache is persisted across reboots, eliminating any warm-up time needed to populate the cache. Write-intensive applications can benefit from Write-Back caching by taking advantage of the fast latencies provided by flash. The amount of disk I/O also reduces when the cache absorbs multiple writes to the same block before writing it to disk. However, if a flash device fails while using Write-Back mode, data that is not yet persistent to disk is lost and must be recovered from a mirror copy. For this reason, Write-Back mode is recommended in conjunction with using high redundancy (triple mirroring) to protect the database files.
Exadata Smart Flash Cache accelerates OLTP performance by providing fast I/O for frequently accessed data blocks. It also accelerates Decision Support System (DSS) performance by automatically caching frequently scanned data and temporary segments, providing columnar cache storage, and enabling other features that optimize the performance of large analytic queries and large loads.
On Extreme Flash (EF) storage servers, all of the data resides in flash. Consequently, Exadata Smart Flash Cache is not required for normal caching. However, in this case Exadata Smart Flash Cache is still used to host the Columnar Cache, which caches data in columnar format and optimizes various analytical queries.
Performance issues related to Exadata Smart Flash Cache typically exhibit increased cell single block physical read latencies in the database. However, occasional long latencies associated with cell single block physical read do not necessarily indicate a performance issue with Exadata Smart Flash Cache, and may simply indicate that the read request was not satisfied using Exadata Smart Flash Cache.
- Monitoring Exadata Smart Flash Cache Using AWR
- Monitoring Exadata Smart Flash Cache Using Database Statistics and Wait Events
- Monitoring Exadata Smart Flash Cache Using Exadata Metrics
- What to Look For When Monitoring Exadata Smart Flash Cache
6.3.1.1 Monitoring Exadata Smart Flash Cache Using AWR
Automatic Workload Repository (AWR) contains a wealth of information relating to Exadata Smart Flash Cache. Following are descriptions and examples of key sections in the AWR report that contain information about Exadata Smart Flash Cache. By reviewing these sections of the AWR report, administrators can understand how Exadata Smart Flash Cache operates.
Flash Cache Configuration and Space Usage
The Flash Cache Configuration section contains summary information including the caching mode (Write-Through or Write-Back), status and overall size. The Flash Cache Space Usage section provides summary statistics on space usage in Exadata Smart Flash Cache.
The following example shows Exadata Smart Flash Cache configured in Write-Back mode on all cells, with a size of almost 24 TB on each cell. In the example, the Flash Cache Space Usage section shows that:
- Practically all of the Exadata Smart Flash Cache space is in use on each cell.

- Approximately 70% of the allocated space is being used for OLTP, which is typically data that is read into the database buffer cache using block mode reads. The OLTP data is further categorized as follows:

  - Clean data is data that is unchanged since being read into the cache. In the example output, approximately 26% of the space in each cache contains clean OLTP data.

  - If cached OLTP data is updated and the update is written back to the primary storage (usually a hard disk drive), the data is classified as synced. In the example output, approximately 23% of the space in each cache contains synced OLTP data.

  - When cached OLTP data is updated, but the update has not been written to the primary storage, the data is classified as unflushed. In the example output, approximately 22% of the space in each cache contains unflushed OLTP data.

- Approximately 20% of the cache space is being used to absorb large writes.

- Approximately 6% of the cache supports scan operations.

- Approximately 2% of the cache is being used to support the columnar cache.

- In this case, none of the cache space is occupied by keep objects. Keep objects are database segments (tables, partitions, and so on) that override the default caching policy for Exadata Smart Flash Cache by using the CELL_FLASH_CACHE storage clause option with the KEEP setting.
Figure 6-4 AWR Report: Flash Cache Configuration and Space Usage

Flash Cache User Reads
The Flash Cache User Reads sections show information about read requests, read throughput, and read efficiency from database clients. The statistics show the I/Os against different areas of Exadata Smart Flash Cache:
- OLTP - relates to block requests
- Scan - relates to scan requests
- Columnar - relates to columnar cache requests
- Keep - relates to read requests against the KEEP pool
Figure 6-5 AWR Report: Flash Cache User Reads

Flash Cache User Writes
The Flash Cache User Writes section shows information about write requests and write throughput for Exadata Smart Flash Cache in Write-Back mode.
In this section, First Writes indicate new data being written to Exadata Smart Flash Cache, while Overwrites indicate data being overwritten in Exadata Smart Flash Cache. First Writes also require writing additional flash cache metadata. Overwrites represent the disk writes that were avoided by using Exadata Smart Flash Cache in Write-Back mode.
Figure 6-6 AWR Report: Flash Cache User Writes

Flash Cache User Writes - Large Writes
The Flash Cache User Writes - Large Writes sections show information about large write requests that are absorbed and rejected by Exadata Smart Flash Cache in Write-Back mode.
Figure 6-7 AWR Report: Flash Cache User Writes - Large Writes

Flash Cache Internal Reads
The Flash Cache Internal Reads section shows reads from Exadata Smart Flash Cache that are performed by Oracle Exadata System Software. These statistics are populated when Exadata Smart Flash Cache is in Write-Back mode.
The Disk Writer IO Detail columns relate to internal I/Os performed to persist data from Exadata Smart Flash Cache to hard disk devices. These columns show the reads from flash and the various categories of writes to hard disk.
Figure 6-8 AWR Report: Flash Cache Internal Reads

Flash Cache Internal Writes
The Flash Cache Internal Writes section shows writes to Exadata Smart Flash Cache that are performed by Oracle Exadata System Software. The internal writes are I/Os that populate Exadata Smart Flash Cache in response to a cache read miss.
The statistics also include metadata writes that occur when Exadata Smart Flash Cache is in Write-Back mode. Metadata writes occur when a cacheline is used to cache new data.
Figure 6-9 AWR Report: Flash Cache Internal Writes

6.3.1.2 Monitoring Exadata Smart Flash Cache Using Database Statistics and Wait Events
The following table describes various database statistics that are useful for monitoring Exadata Smart Flash Cache. The statistics are available in various dynamic performance views, including V$SYSSTAT, and may be displayed in the Global Activity Statistics or Instance Activity Statistics section of an AWR report.
| Statistic | Description |
|---|---|
| cell flash cache read hits | The number of read requests that were satisfied by the cache |
| cell flash cache read hits for smart IO | The number of read requests for smart IO that were satisfied by the cache |
| cell flash cache read hits for temp IO | The number of read requests for temp IO that were satisfied by the cache |
| cell flash read hits for controlfile reads | The number of read requests for a database control file that were satisfied by the cache |
| cell overwrites in flash | Total number of write requests that overwrote an existing cacheline in Exadata Smart Flash Cache that had not been written to disk. In effect, this is the amount of disk I/O saved by using Write-Back mode. This statistic is incremented once per mirror write. |
| cell partial writes in flash | Total number of write requests written to both Exadata Smart Flash Cache and disk. Part of the data was written to flash, and the rest was written to disk. This statistic is incremented once per mirror write. |
| cell writes to flash cache | Total number of write requests written entirely to Exadata Smart Flash Cache. This statistic is incremented once per mirror write. |
| cell writes to flash cache for temp IO | The number of write requests for temporary segments that were absorbed by Exadata Smart Flash Cache |
| physical read IO requests | Number of physical read requests issued by the database. |
| physical read requests optimized | Total number of read requests satisfied by using Exadata Smart Flash Cache, and read requests avoided by using storage index or columnar cache. |
| physical read total bytes | Total amount of I/O bytes for reads issued by the database, whether or not the read was offloaded to the storage servers. |
| physical read total bytes optimized | Total number of bytes read from Exadata Smart Flash Cache, and bytes avoided by using storage index or columnar cache. |
| physical read total IO requests | Total number of read requests issued by the database for all activity including application, backup, recovery, and other utilities. |
| physical write IO requests | Number of physical write requests issued by the database. |
| physical write total bytes | Total number of IO bytes for writes issued by the database for all activity. |
| physical write total bytes optimized | Total number of bytes written to Exadata Smart Flash Cache. These writes are synchronized to disk in a lazy manner. |
| physical write total IO requests | Total number of write requests issued by the database for all activity, including application, backup, recovery and other utilities. |
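For example, the following query (a sketch using statistics from the table above) compares the total read requests issued by the database instance with the requests that were optimized by Exadata Smart Flash Cache, storage index, or columnar cache:

SQL> SELECT name, value
FROM v$sysstat
WHERE name IN ('physical read total IO requests',
               'physical read requests optimized',
               'cell flash cache read hits')
ORDER BY name;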
The following table describes database wait events that are useful for monitoring Exadata Smart Flash Cache. The wait events are visible in various dynamic performance views, including V$SESSION, V$SYSTEM_EVENT, and V$SESSION_EVENT, and may be displayed in the Wait Event sections of the AWR report.

The latency of the cell single block physical read event typically indicates if the read is satisfied from Exadata Smart Flash Cache. The higher the latency, the higher the likelihood that the request was not satisfied from Exadata Smart Flash Cache.
| Wait Event | Description |
|---|---|
| cell list of blocks physical read | This wait event occurs during recovery or during buffer pre-fetching (rather than performing multiple single-block reads). It is used to monitor database blocks that must be changed as part of recovery and are read in parallel for the database. This wait event is equivalent to db file parallel read for a cell. |
| cell list of blocks read request | This is a placeholder wait event associated with cell list of blocks physical read that is visible only during the wait period. |
| cell multiblock physical read | This wait event represents the time taken to perform all the I/Os for a multi-block database read. This wait event is equivalent to db file scattered read for a cell. |
| cell multiblock read request | This is a placeholder wait event associated with cell multiblock physical read that is visible only during the wait period. |
| cell single block physical read | This wait event represents the time taken to perform a single block database I/O. This wait event is equivalent to db file sequential read for a cell. Commencing with the May 2022 Oracle Database release updates (versions 19.15.0.0.220419, 21.6.0.0.220419, and later), this wait event no longer includes I/O from Exadata Smart Flash Cache or database I/O using a Remote Direct Memory Access (RDMA) read. |
| cell single block physical read: flash cache | This wait event represents the time taken to perform a single block database I/O from Exadata Smart Flash Cache. This wait event was introduced in the May 2022 Oracle Database release updates and is present in Oracle Database versions 19.15.0.0.220419, 21.6.0.0.220419, and later. Previously, these I/Os were included in cell single block physical read. |
| cell single block read request | This is a placeholder wait event associated with a single block database I/O that is visible only during the wait period. After the wait event ends, the placeholder is converted to the appropriate wait event, which is typically one of the cell single block physical read events. |
| cell interconnect retransmit during physical read | This wait event appears during retransmission for an I/O of a single-block or multiblock read. The P1 column identifies the cell hash number. |
The availability of a specific statistic or wait event is subject to the version of Oracle Database being used.
6.3.1.3 Monitoring Exadata Smart Flash Cache Using Exadata Metrics
Exadata metrics that are related to Exadata Smart Flash Cache are identified in the Exadata storage server METRICCURRENT, METRICDEFINITION, and METRICHISTORY objects as having objectType=FLASHCACHE.
Example 6-1 Displaying Flash Cache Metric Definitions
This example shows how to display the flash cache metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = FLASHCACHE
FC_BYKEEP_DIRTY "Number of megabytes unflushed for keep objects on FlashCache"
FC_BYKEEP_OLTP "Number of megabytes for OLTP keep objects in flash cache"
FC_BYKEEP_OVERWR "Number of megabytes pushed out of the FlashCache because of space limit for keep objects"
FC_BYKEEP_OVERWR_SEC "Number of megabytes per second pushed out of the FlashCache because of space limit for keep objects"
FC_BYKEEP_USED "Number of megabytes used for keep objects on FlashCache"
FC_BY_ALLOCATED "Number of megabytes allocated in flash cache"
FC_BY_ALLOCATED_DIRTY "Number of megabytes allocated for unflushed data in flash cache"
FC_BY_ALLOCATED_OLTP "Number of megabytes allocated for OLTP data in flash cache"
FC_BY_DIRTY "Number of unflushed megabytes in FlashCache"
FC_BY_STALE_DIRTY "Number of unflushed megabytes in FlashCache which cannot be flushed because cached disks are not accessible"
FC_BY_USED "Number of megabytes used on FlashCache"
FC_COL_BYKEEP_USED "Number of megabytes used for keep objects in Columnar FlashCache"
FC_COL_BY_USED "Number of megabytes used in Columnar FlashCache"
FC_COL_IO_BYKEEP_R "Number of megabytes read from Columnar FlashCache for keep objects"
FC_COL_IO_BYKEEP_R_SEC "Number of megabytes read per second from Columnar FlashCache for keep objects"
FC_COL_IO_BY_R "Number of megabytes that were read from Columnar FlashCache"
FC_COL_IO_BY_R_ELIGIBLE "Number of megabytes eligible to read from Columnar FlashCache"
FC_COL_IO_BY_R_ELIGIBLE_SEC "Number of megabytes per second eligible to read from Columnar FlashCache"
FC_COL_IO_BY_R_SEC "Number of megabytes per second that were read from Columnar FlashCache"
FC_COL_IO_BY_SAVED "Number of megabytes saved by reads from Columnar FlashCache"
FC_COL_IO_BY_SAVED_SEC "Number of megabytes saved per second by reads from Columnar FlashCache"
FC_COL_IO_BY_W_POPULATE "Number of megabytes that are population writes into Columnar FlashCache due to read miss"
FC_COL_IO_BY_W_POPULATE_SEC "Number of megabytes per second that are population writes into Columnar FlashCache due to read miss"
FC_COL_IO_RQKEEP_R "Number of requests read for keep objects from Columnar FlashCache"
FC_COL_IO_RQKEEP_R_SEC "Number of requests read per second for keep objects from Columnar FlashCache"
FC_COL_IO_RQ_R "Number of requests that were read from Columnar FlashCache"
FC_COL_IO_RQ_R_ELIGIBLE "Number of reads eligible for Columnar FlashCache"
FC_COL_IO_RQ_R_ELIGIBLE_SEC "Number of reads per second eligible for Columnar FlashCache"
FC_COL_IO_RQ_R_SEC "Number of requests per second that were read from Columnar FlashCache"
FC_COL_IO_RQ_W_POPULATE "Number of requests that are population writes into Columnar FlashCache due to read miss"
FC_COL_IO_RQ_W_POPULATE_SEC "Number of requests per second that are population writes into Columnar FlashCache due to read miss"
FC_IO_BYKEEP_R "Number of megabytes read from FlashCache for keep objects"
FC_IO_BYKEEP_R_SEC "Number of megabytes read per second from FlashCache for keep objects"
FC_IO_BYKEEP_W "Number of megabytes written to FlashCache for keep objects"
FC_IO_BYKEEP_W_SEC "Number of megabytes per second written to FlashCache for keep objects"
FC_IO_BY_DISK_WRITE "Number of megabytes written from flash cache to hard disks"
FC_IO_BY_DISK_WRITE_SEC "Number of megabytes per second written from flash cache to hard disks"
FC_IO_BY_R "Number of megabytes of small reads (OLTP) from flash cache"
FC_IO_BY_R_ACTIVE_SECONDARY "Number of megabytes for active secondary reads satisfied from flash cache"
FC_IO_BY_R_ACTIVE_SECONDARY_MISS "Number of megabytes for active secondary reads not satisfied from flash cache"
FC_IO_BY_R_ACTIVE_SECONDARY_MISS_SEC "Number of megabytes per second for active secondary reads not satisfied from flash cache"
FC_IO_BY_R_ACTIVE_SECONDARY_SEC "Number of megabytes per second for active secondary reads satisfied from flash cache"
FC_IO_BY_R_DISK_WRITER "Number of megabytes read from flash cache by disk writer"
FC_IO_BY_R_DISK_WRITER_SEC "Number of megabytes per second read from flash cache by disk writer"
FC_IO_BY_R_DW "Number of megabytes of large reads (DW) from flash cache"
FC_IO_BY_R_DW_SEC "Number of megabytes of large reads (DW) per second from flash cache"
FC_IO_BY_R_MISS "Number of megabytes of small reads (OLTP) from disks because some of the requested data was not in flash cache"
FC_IO_BY_R_MISS_DW "Number of megabytes of large reads (DW) from disks because some of the requested data was not in flash cache"
FC_IO_BY_R_MISS_DW_SEC "Number of megabytes of large reads (DW) per second from disks because some of the requested data was not in flash cache"
FC_IO_BY_R_MISS_SEC "Number of megabytes of small reads (OLTP) per second from disks because some of the requested data was not in flash cache"
FC_IO_BY_R_SEC "Number of megabytes of small reads (OLTP) per second from flash cache"
FC_IO_BY_R_SKIP "Number of megabytes read from disks for IO requests that bypass FlashCache"
FC_IO_BY_R_SKIP_FC_THROTTLE "Number of megabytes read from disk for IO requests that bypass FlashCache due to heavy load on FlashCache"
FC_IO_BY_R_SKIP_FC_THROTTLE_SEC "Number of megabytes read per second from disk for IO requests that bypass FlashCache due to heavy load on FlashCache"
FC_IO_BY_R_SKIP_LG "Number of megabytes read from disk for IO requests that bypass FlashCache due to the large IO size"
FC_IO_BY_R_SKIP_LG_SEC "Number of megabytes read per second from disk for IO requests that bypass FlashCache due to the large IO size"
FC_IO_BY_R_SKIP_NCMIRROR "Number of megabytes read from disk for IO requests that bypass FlashCache as the IO is on non-primary, non-active secondary mirror"
FC_IO_BY_R_SKIP_SEC "Number of megabytes read from disks per second for IO requests that bypass FlashCache"
FC_IO_BY_W "Number of megabytes written to FlashCache"
FC_IO_BY_W_DISK_WRITER "Number of megabytes written to hard disks by disk writer"
FC_IO_BY_W_DISK_WRITER_SEC "Number of megabytes per second written to hard disks by disk writer"
FC_IO_BY_W_FIRST "Number of megabytes that are first writes into flash cache"
FC_IO_BY_W_FIRST_SEC "Number of megabytes per second that are first writes into flash cache"
FC_IO_BY_W_LG_CHINT "Number of megabytes used for large writes to flash cache due to cache hint"
FC_IO_BY_W_LG_DTAGE "Number of megabytes used for data aging large writes to flash cache"
FC_IO_BY_W_LG_MRCV "Number of megabytes used for media recovery large writes to flash cache"
FC_IO_BY_W_METADATA "Number of megabytes that are flash cache metadata writes"
FC_IO_BY_W_METADATA_SEC "Number of megabytes per second that are flash cache metadata writes"
FC_IO_BY_W_OVERWRITE "Number of megabytes that are overwrites into flash cache"
FC_IO_BY_W_OVERWRITE_SEC "Number of megabytes per second that are overwrites into flash cache"
FC_IO_BY_W_POPULATE "Number of megabytes that are population writes into flash cache due to read miss"
FC_IO_BY_W_POPULATE_SEC "Number of megabytes per second that are population writes into flash cache due to read miss"
FC_IO_BY_W_SEC "Number of megabytes per second written to FlashCache"
FC_IO_BY_W_SKIP "Number of megabytes written to disk for IO requests that bypass FlashCache"
FC_IO_BY_W_SKIP_FC_THROTTLE "Number of megabytes written to disk for IO requests that bypass FlashCache due to heavy load on FlashCache"
FC_IO_BY_W_SKIP_FC_THROTTLE_SEC "Number of megabytes written per second to disk for IO requests that bypass FlashCache due to heavy load on FlashCache"
FC_IO_BY_W_SKIP_LG "Number of megabytes written to disk for IO requests that bypass FlashCache due to the large IO size"
FC_IO_BY_W_SKIP_LG_SEC "Number of megabytes written per second to disk for IO requests that bypass FlashCache due to the large IO size"
FC_IO_BY_W_SKIP_NCMIRROR "Number of megabytes written to disk for IO requests that bypass FlashCache as the IO is on non-primary, non-active secondary mirror"
FC_IO_BY_W_SKIP_SEC "Number of megabytes written to disk per second for IO requests that bypass FlashCache"
FC_IO_ERRS "Number of IO errors on FlashCache"
FC_IO_RQKEEP_R "Number of read requests for keep objects from FlashCache"
FC_IO_RQKEEP_R_MISS "Number of read requests for keep objects which did not find all data in FlashCache"
FC_IO_RQKEEP_R_MISS_SEC "Number of read requests per second for keep objects which did not find all data in FlashCache"
FC_IO_RQKEEP_R_SEC "Number of read requests per second for keep objects from FlashCache"
FC_IO_RQKEEP_R_SKIP "Number of read requests for keep objects that bypass FlashCache"
FC_IO_RQKEEP_R_SKIP_SEC "Number of read requests per second for keep objects that bypass FlashCache"
FC_IO_RQKEEP_W "Number of requests for keep objects which resulted in FlashCache being populated with data"
FC_IO_RQKEEP_W_SEC "Number of requests per second for keep objects which resulted in FlashCache being populated with data"
FC_IO_RQ_DISK_WRITE "Number of requests written from flash cache to hard disks"
FC_IO_RQ_DISK_WRITE_SEC "Number of requests per second written from flash cache to hard disks"
FC_IO_RQ_R "Number of small reads (OLTP) satisfied from the flash cache"
FC_IO_RQ_REPLACEMENT_ATTEMPTED "Number of requests attempted to find space in the flash cache"
FC_IO_RQ_REPLACEMENT_DW_FAILED "Number of times that client DW IOs failed to find a replacement buffer"
FC_IO_RQ_REPLACEMENT_DW_SUCCEEDED "Number of times that client DW IOs succeeded in finding a replacement buffer"
FC_IO_RQ_REPLACEMENT_FAILED "Number of requests failed to find space in the flash cache"
FC_IO_RQ_REPLACEMENT_OLTP_FAILED "Number of times that client OLTP IOs failed to find a replacement buffer"
FC_IO_RQ_REPLACEMENT_OLTP_SUCCEEDED "Number of times that client OLTP IOs succeeded in finding a replacement buffer"
FC_IO_RQ_R_ACTIVE_SECONDARY "Number of requests for active secondary reads satisfied from flash cache"
FC_IO_RQ_R_ACTIVE_SECONDARY_MISS "Number of requests for active secondary reads not satisfied from flash cache"
FC_IO_RQ_R_ACTIVE_SECONDARY_MISS_SEC "Number of requests per second for active secondary reads not satisfied from flash cache"
FC_IO_RQ_R_ACTIVE_SECONDARY_SEC "Number of requests per second for active secondary reads satisfied from flash cache"
FC_IO_RQ_R_DISK_WRITER "Number of requests read from flash cache by disk writer"
FC_IO_RQ_R_DISK_WRITER_SEC "Number of requests per second read from flash cache by disk writer"
FC_IO_RQ_R_DW "Number of large reads (DW) satisfied from the flash cache"
FC_IO_RQ_R_DW_SEC "Number of large reads (DW) per second satisfied from the flash cache"
FC_IO_RQ_R_MISS "Number of small reads (OLTP) that did not find all data in flash cache"
FC_IO_RQ_R_MISS_DW "Number of large reads (DW) that did not find all data in flash cache"
FC_IO_RQ_R_MISS_DW_SEC "Number of large reads (DW) per second that did not find all data in flash cache"
FC_IO_RQ_R_MISS_SEC "Number of small reads (OLTP) per second that did not find all data in flash cache"
FC_IO_RQ_R_SEC "Number of small reads (OLTP) per second satisfied from the flash cache"
FC_IO_RQ_R_SKIP "Number of requests read from disk that bypass FlashCache"
FC_IO_RQ_R_SKIP_FC_THROTTLE "Number of requests read from disk that bypass FlashCache due to heavy load on FlashCache"
FC_IO_RQ_R_SKIP_FC_THROTTLE_SEC "Number of requests read from disk per second that bypassed FlashCache due to heavy load on FlashCache"
FC_IO_RQ_R_SKIP_LG "Number of requests read from disk that bypass FlashCache due to the large IO size"
FC_IO_RQ_R_SKIP_LG_SEC "Number of requests read from disk per second that bypass FlashCache due to the large IO size"
FC_IO_RQ_R_SKIP_NCMIRROR "Number of requests read from disk that bypass FlashCache as the IO is on non-primary, non-active secondary mirror"
FC_IO_RQ_R_SKIP_SEC "Number of requests read from disk per second that bypass FlashCache"
FC_IO_RQ_W "Number of requests which resulted in FlashCache being populated with data"
FC_IO_RQ_W_DISK_WRITER "Number of requests written to hard disks by disk writer"
FC_IO_RQ_W_DISK_WRITER_SEC "Number of requests per second written to hard disks by disk writer"
FC_IO_RQ_W_FIRST "Number of requests that are first writes into flash cache"
FC_IO_RQ_W_FIRST_SEC "Number of requests per second that are first writes into flash cache"
FC_IO_RQ_W_LG_CHINT "Number of large writes to flash cache due to cache hint"
FC_IO_RQ_W_LG_DTAGE "Number of data aging large writes to flash cache"
FC_IO_RQ_W_LG_MRCV "Number of media recovery large writes to flash cache"
FC_IO_RQ_W_METADATA "Number of requests that are flash cache metadata writes"
FC_IO_RQ_W_METADATA_SEC "Number of requests per second that are flash cache metadata writes"
FC_IO_RQ_W_OVERWRITE "Number of requests that are overwrites into flash cache"
FC_IO_RQ_W_OVERWRITE_SEC "Number of requests per second that are overwrites into flash cache"
FC_IO_RQ_W_POPULATE "Number of requests that are population writes into flash cache due to read miss"
FC_IO_RQ_W_POPULATE_SEC "Number of requests per second that are population writes into flash cache due to read miss"
FC_IO_RQ_W_SEC "Number of requests per second which resulted in FlashCache being populated with data"
FC_IO_RQ_W_SKIP "Number of requests written to disk that bypass FlashCache"
FC_IO_RQ_W_SKIP_FC_THROTTLE "Number of requests written to disk that bypass FlashCache due to heavy load on FlashCache"
FC_IO_RQ_W_SKIP_FC_THROTTLE_SEC "Number of requests written to disk per second that bypass FlashCache due to heavy load on FlashCache"
FC_IO_RQ_W_SKIP_LG "Number of requests written to disk that bypass FlashCache due to the large IO size"
FC_IO_RQ_W_SKIP_LG_SEC "Number of requests written to disk per second that bypass FlashCache due to the large IO size"
FC_IO_RQ_W_SKIP_NCMIRROR "Number of requests written to disk that bypass FlashCache as the IO is on non-primary, non-active secondary mirror"
FC_IO_RQ_W_SKIP_SEC "Number of requests written to disk per second that bypass FlashCache"
Note the following additional details:
- Space in Exadata Smart Flash Cache is internally managed in chunks known as cachelines. FC_BY_ALLOCATED represents the amount of space (in megabytes) that is allocated to cachelines in the flash cache. If the value is close to the flash cache size, then the flash cache is fully populated.

- FC_BY_USED represents the amount of space (in megabytes) that is occupied by data in the flash cache. For some workloads, like OLTP, the FC_BY_USED value can be much less than the value of FC_BY_ALLOCATED because an OLTP write might only use a small fraction of a cacheline. The total amount used by OLTP cachelines is tracked by FC_BY_ALLOCATED_OLTP.

- Various flash cache metrics having names ending with _DW track large read operations, which are typically associated with Data Warehouse workloads. Each of these metrics has a corresponding metric that tracks small read operations, which are typically associated with OLTP operations. To get a total count, add the large read metric value to the corresponding small read metric value.

  For example, to get the total number of reads satisfied from flash cache, add FC_IO_RQ_R_DW to FC_IO_RQ_R (see the CellCLI sketch following this list).
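The following CellCLI sketch lists the current values of the small (OLTP) and large (DW) read metrics on one storage server; the regular-expression pattern used in the LIKE filter is an assumption about how you might group them.

CellCLI> LIST METRICCURRENT ATTRIBUTES name, metricValue WHERE name LIKE 'FC_IO_RQ_R.*'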
6.3.1.4 What to Look For When Monitoring Exadata Smart Flash Cache
Increased Read Latencies
Possible issues related to Exadata Smart Flash Cache tend to be visible in the database as increased read latencies, specifically in the cell single block physical read wait event. The increased latency is usually caused when reads are issued against the hard disks instead of being satisfied by Exadata Smart Flash Cache.
Read requests may not be satisfied from Exadata Smart Flash Cache when:
- The required data is not in Exadata Smart Flash Cache.

  In this case, requests are recorded as flash cache misses, which are visible as increased misses in the Flash Cache User Reads section of the AWR report. This is also typically accompanied by increased population writes, which are visible in the Flash Cache Internal Writes section of the AWR report.

  On the database, you may see a reduction in the number of cell flash cache read hits compared with the physical read IO requests or physical read total IO requests.

- The data is not eligible for Exadata Smart Flash Cache.

  To maximize cache efficiency, when an I/O request is sent from the database to the storage servers, it includes a hint that indicates whether or not the data should be cached in Exadata Smart Flash Cache.

  If data is not eligible for caching, the corresponding I/O is visible in the Flash Cache User Reads - Skips section of the AWR report, along with the reason why the read was not eligible. Possible reasons include:

  - The grid disk caching policy is set to none.
  - The database segment is configured with the CELL_FLASH_CACHE NONE storage option.
  - The I/O type and situation precludes caching. For example, the I/O is related to an RMAN backup operation and the hard disks are not busy. If this is a common occurrence, it is evident in the Top IO Reasons section of the AWR report.
In some cases, the cell single block physical read latency may increase, with no apparent difference in Exadata Smart Flash Cache behavior. This can be caused by increased I/O load, especially on hard disks.

Occasional long latencies in cell single block physical read, usually visible as a long tail in the cell single block physical read histogram, may simply indicate that the data is read from hard disk, and do not necessarily indicate a problem with Exadata Smart Flash Cache.
Skipped Writes
When using Exadata Smart Flash Cache in Write-Back mode, most database write requests are absorbed by the cache. However, some write requests may skip the cache. The most common reason for skipping the cache is when the request writes a large amount of data that will be read only once, such as temp sorts, or that is not expected to be read at all in the foreseeable future, such as backups and archiving.
Another common reason for skipping large writes is Disk Not Busy, which typically means that there is no benefit to using Exadata Smart Flash Cache because the hard disks have sufficient capacity to handle the write requests.
If skipping Exadata Smart Flash Cache for large writes is causing a performance issue, then it is typically visible in the database with corresponding long latencies for the direct path write or direct path write temp wait events.
The reasons for rejecting large writes are visible in the Flash Cache User Writes - Large Write Rejections section of the AWR report.
Database Working Set Size
The database working set refers to the subset of most commonly accessed information in the database. In most cases, the database working set is fairly stable. However, if for any reason, the working set does not fit in Exadata Smart Flash Cache, you may see the following symptoms:
- Increased cache misses, indicating that data is not in Exadata Smart Flash Cache. This is visible in the Flash Cache User Reads section of the AWR report, or the FC_IO_RQ_R_MISS_SEC cell metric.

- Increased population activity to populate data not in Exadata Smart Flash Cache. This is visible in the Flash Cache Internal Writes section of the AWR report, or the FC_IO_[BY|RQ]_W_POPULATE_SEC cell metrics.

- Increased disk writer activity, indicating that dirty cachelines have to be written to disk so that the cachelines can be reused to cache other data. Disk writer activity is visible in the Flash Cache Internal Reads section of the AWR report, or the FC_IO_[RQ|BY]_[R|W]_DISK_WRITER_SEC cell metrics.

- Increased first writes, indicating that new data is being written to Exadata Smart Flash Cache. A large number of first writes with few overwrites means new data is being written to Exadata Smart Flash Cache. This is visible in the Flash Cache User Writes section of the AWR report, or in the FC_IO_[RQ|BY]_W_FIRST_SEC cell metrics.
In this case:
- Review the database access patterns for tuning opportunities that reduce the amount of data being accessed.
- Consider increasing the number of available storage servers to deliver more space for Exadata Smart Flash Cache.
- Review your I/O Resource Management (IORM) quotas for Exadata Smart Flash Cache and allocate space where it is most required.
- Consider using Extreme Flash storage servers to eliminate the disk I/Os.
Miscellaneous Issues
There are some cases when the increased cell single block physical read latency may not be due to cell performance, but may be caused by something else along the I/O path, such as network contention or contention for database server CPU resources.

The histograms for cell single block physical read and small reads are available in the Single Block Reads and Small Read Histogram - Detail sections of the AWR report, under Exadata Statistics > Performance Summary. The cell single block physical read histogram shows latencies measured by Oracle Database, while the small read histograms show latencies measured in the storage server.

A histogram with a significant number of occasional long latencies is said to have a long tail. When the histograms for cell single block physical read and small reads have long tails, then this is an indication of slow read times in the storage server, which would warrant further investigation of the other I/O performance statistics. See Monitoring Cell Disk I/O.

If the cell single block physical read histogram has a long tail that is not present in the small read histograms, then the cause is generally not in the storage server, but rather something else in the I/O path, such as bottlenecks in the network or contention for compute node CPU.
6.3.2 Monitoring XRMEM Cache
Exadata RDMA Memory Cache (XRMEM cache) provides direct access to storage server memory using Remote Direct Memory Access (RDMA), enabling lower read latencies and faster response times. When database clients read from the XRMEM cache, the client software performs an RDMA read of the cached data, which bypasses the storage server software and results in much lower read latencies.
XRMEM cache works in conjunction with Exadata Smart Flash Cache. If available, data that is not in XRMEM cache may be retrieved from Exadata Smart Flash Cache.
Statistics from XRMEM cache are slightly
different when compared with other Exadata components. Because clients issue RDMA I/O
directly to XRMEM cache, the request does not go to
cellsrv
, so the storage server cannot account for the RDMA I/Os.
For this reason, there are no cell metrics for XRMEM cache
I/O. Instead, Oracle Database statistics account for the I/O that is performed using
RDMA.
Performance issues related to XRMEM cache
typically cause latency increases in the Oracle Database cell single block
physical read
wait events. However, bear in mind that Exadata Smart Flash Cache is still available to
service the requests, and although requests from Exadata Smart Flash Cache generally experience higher read latencies compared to XRMEM cache, the requests still benefit from the fast I/O
provided by flash.
- Monitoring XRMEM Cache Using AWR
- Monitoring XRMEM Cache Using Database Statistics and Wait Events
- Monitoring XRMEM Cache Using Exadata Metrics
- What to Look For When Monitoring XRMEM Cache
Parent topic: Monitoring Oracle Exadata System Software Components
6.3.2.1 Monitoring XRMEM Cache Using AWR
Automatic Workload Repository (AWR) contains information relating to XRMEM cache. Following are descriptions and examples of the most commonly used sections in the AWR report that contain information about XRMEM cache. By reviewing these sections of the AWR report, administrators can understand the operation of XRMEM cache.
Note that because storage servers cannot account for the RDMA I/Os, the
Oracle Database statistics are critical in understanding the use of XRMEM cache. The XRMEM cache
statistics in the Exadata-specific sections of the AWR report only include I/Os that are
serviced by cellsrv
, which are not RDMA I/Os.
Database IOs
A summary of Database IOs may be found in the Single Block Reads section of the AWR report. This summary contains information about the effectiveness of XRMEM cache. The Single Block Reads section is located in the AWR report under Exadata Statistics > Performance Summary.
In the following example, the overwhelming majority of all read I/O
requests (approximately 88%) are serviced using RDMA reads to XRMEM cache (cell RDMA reads
), at a
rate of more than 176,000 per second. Nearly all of the rest are satisfied using
non-RDMA XRMEM cache reads (cell xrmem cache
read hits
), while the remaining reads are serviced by using Exadata Smart Flash Cache (cell flash
cache read hits
). Notice the massive difference in throughput for each
different I/O type, which illustrates the power of XRMEM cache and RDMA.
Figure 6-10 AWR Report: Database IOs

XRMEM Cache Configuration and Space Usage
The XRMEM Cache Configuration section contains summary information including the caching mode (Write-Through or Write-Back) and overall size. The XRMEM Cache Space Usage section provides summary statistics on space usage in XRMEM cache.
The following example shows XRMEM cache configured in Write-Through mode on all cells, with a total size of approximately 1509 GB.
Figure 6-11 AWR Report: XRMEM Cache Configuration

The following example of the XRMEM Cache Space Usage section shows 172 GB of XRMEM cache spread evenly across 3 cells.
Figure 6-12 AWR Report: XRMEM Cache Space Usage

XRMEM Cache User Reads
The XRMEM Cache User Reads section
shows information about read requests, read throughput, and read efficiency. Because
storage servers cannot account for RDMA I/Os, the statistics only relate to non-RDMA
reads processed by cellsrv
. Hits are for I/Os satisfied using XRMEM cache, while misses indicate that the data was
not in XRMEM cache.
Figure 6-13 AWR Report: XRMEM Cache User Reads

XRMEM Cache Internal Writes
The XRMEM Cache Internal Writes section shows writes to XRMEM cache that are performed by Oracle Exadata System Software. The internal writes are I/Os that populate XRMEM cache.
Figure 6-14 AWR Report: XRMEM Cache Internal Writes

Parent topic: Monitoring XRMEM Cache
6.3.2.2 Monitoring XRMEM Cache Using Database Statistics and Wait Events
Note:
The availability of a specific statistic or wait event is subject to the version of Oracle Database being used.
The following table describes database statistics that are useful for
monitoring XRMEM cache. The statistics are available in
various dynamic performance views, including V$SYSSTAT
, and may be
displayed in the Global Activity Statistics or Instance Activity Statistics section of
an AWR report.
Statistic | Description |
---|---|
cell RDMA reads |
The number of successful XRMEM cache read requests using RDMA |
cell RDMA reads eligible |
The number of database reads that meet basic eligibility criteria for RDMA |
cell RDMA read hash table
probes |
The total number of RDMA hash table probes issued to determine the presence of data in XRMEM cache. Each eligible read is usually associated with a hash table probe. |
cell RDMA reads issued |
The total number of RDMA reads issued to retrieve data from XRMEM cache. An RDMA read is usually issued after each successful hash table probe. |
|
The total number of RDMA hash table probes that failed. Each statistic covers a specific failure reason. The sum of these failures accounts for most of the difference between
Some RDMA hash table probe failures are normal while XRMEM cache is being initialized. However, frequent failures or a large number of failures may indicate a problem requiring further investigation. |
|
The total number of RDMA reads that failed. Each statistic covers a specific failure reason. The sum of these failures accounts for most of the difference between
Normally, some RDMA read failures may occur. However, frequent failures or a large number of failures may indicate a problem requiring further investigation. |
cell RDMA reads rejected -
ineligible |
The number of database reads that failed basic eligibility criteria for RDMA |
cell xrmem cache read hits |
The number of non-RDMA read requests processed by
cellsrv resulting in an XRMEM cache hit
|
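For example, the cumulative values of these statistics can be checked directly in V$SYSSTAT. The following is a minimal sketch that uses only statistic names listed above; the counters are cumulative since instance startup, so compare two samples for rates.
-- Key XRMEM cache and RDMA read statistics (cumulative since startup)
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('cell RDMA reads',
                'cell RDMA reads eligible',
                'cell RDMA reads issued',
                'cell xrmem cache read hits')
ORDER BY name;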
The following table describes database wait events that are useful for
monitoring XRMEM cache. The wait events are visible in
various dynamic performance views, including V$SESSION
,
V$SYSTEM_EVENT
and V$SESSION_EVENT
, and may be
displayed in the Wait Event sections of the AWR report.
Wait Event | Description |
---|---|
cell list of blocks physical
read |
This wait event occurs during recovery or during buffer pre-fetching (rather than performing multiple single-block reads). It is used to monitor database blocks that must be changed as part of recovery and are read in parallel for the database. In
This wait event is equivalent to |
cell list of blocks read request |
This is a placeholder wait event associated with
|
cell multiblock physical read |
This wait event represents the time taken to perform all the I/Os for a multi-block database read. In
This wait event is equivalent to |
cell multiblock read request |
This is a placeholder wait event associated with
|
cell single block physical read |
This wait event represents the time taken to perform a
single block database I/O, equivalent to This wait event does not include I/O from Exadata Smart Flash Cache, I/O from XRMEM cache, or database I/O using a Remote Direct Memory Access (RDMA) read. In
|
cell single block physical read:
RDMA |
This wait event represents the time taken to perform a single block database I/O using a Remote Direct Memory Access (RDMA) read. In
This wait event was introduced in the May 2022 Oracle
Database release updates and is present in Oracle Database versions
19.15.0.0.220419, 21.6.0.0.220419, and later. Previously,
|
cell single block physical read: xrmem
cache |
This wait event represents the time taken to perform a single block database I/O from XRMEM cache. Effective use of XRMEM cache coincides with extremely low latency for this wait event, which is typical for operations using RDMA. In
|
cell single block read request |
This is a placeholder wait event associated with a single
block database I/O that is visible only during the wait period.
After the wait event ends, the placeholder is converted to the
appropriate wait event, which is typically one of the |
cell interconnect retransmit during physical
read |
This wait event appears during retransmission for an I/O
of a single-block or multiblock read. The cell hash number in the
|
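Outside of an AWR report, the same wait events can be summarized with a query against V$SYSTEM_EVENT. The following sketch reports the average latency for the single-block read wait events; an event name that does not exist in your Oracle Database version simply returns no row.
-- Average latency of the single-block read wait event family
SELECT event,
       total_waits,
       ROUND(time_waited_micro / GREATEST(total_waits, 1), 1) AS avg_wait_us
FROM   v$system_event
WHERE  event LIKE 'cell single block physical read%'
ORDER BY avg_wait_us;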
Parent topic: Monitoring XRMEM Cache
6.3.2.3 Monitoring XRMEM Cache Using Exadata Metrics
Exadata metrics that are related to XRMEM cache are identified in the Exadata storage server
METRICCURRENT
, METRICDEFINITION
, and
METRICHISTORY
objects as having
objectType=XRMEMCACHE
.
Note that because reads from XRMEM cache are primarily performed using RDMA calls from Oracle Database, there are no Exadata metrics that tally XRMEM cache I/O.
Example 6-2 Displaying XRMEM Cache Metric Definitions
This example shows how to display the XRMEM cache metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = XRMEMCACHE
XRM_BY_ALLOCATED "Number of megabytes allocated in XRMEM cache"
Note the following additional details:
- XRM_BY_ALLOCATED represents the number of megabytes allocated in XRMEM cache and tracks how many cachelines are used in the XRMEM cache. If the value is close to the XRMEM cache size, then the XRMEM cache is fully populated. This metric is also available per database (DB_XRM_BY_ALLOCATED) and per PDB (PDB_XRM_BY_ALLOCATED).
Parent topic: Monitoring XRMEM Cache
6.3.2.4 What to Look For When Monitoring XRMEM Cache
Database Working Set
Performance is maximized if the entire database working set fits into XRMEM cache. However, the XRMEM cache is smaller than Exadata Smart Flash Cache and it is likely that the database working set, or even the most frequently accessed portion of it, might not reside in XRMEM cache.
If the size of the working set increases you may observe more reads
being satisfied by Exadata Smart Flash Cache or
hard disk instead of XRMEM cache. This is magnified
when access patterns on the data are more random. Consequently, you may observe
increased read latencies, specifically in the cell single block physical
read
wait event.
Non-RDMA Reads
XRMEM cache is most effective when it is
used in conjunction with RDMA. However, in various situations requests to
cellsrv
might still be satisfied using XRMEM cache. In such cases, you will see the associated reads as
cell xrmem cache read hits
rather than cell RDMA
reads
.
A non-RDMA read may occur for the following reasons:
- Based on the type of data being read, Oracle Database may not be able to perform an RDMA read. Direct reads and reads from the Oracle Database control file are examples of this. To identify the cause, you may need to correlate with other statistics or wait events. For example, significant activity against the control files is usually accompanied by the control file sequential read wait event, or is visible in the IOStat by Filetype section of the AWR report.
- In the Oracle Database client, buffers may not be registered for RDMA. This normally occurs shortly after a client process starts or while XRMEM cache is populating.
- Inside XRMEM cache, the RDMA hash table may not contain the required metadata (or it may be marked as invalid). This normally occurs while XRMEM cache is populating.
- An RDMA read times out after exceeding the default lease time (20 ms).
- Memory limitations prevent the creation of memory structures that are required for RDMA.
- An RDMA read error occurs.
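To gauge how much read activity is falling back to non-RDMA processing, you can compare the two statistics mentioned above. The following is a rough sketch that reports the percentage of XRMEM cache read hits served by cellsrv rather than RDMA; it uses cumulative instance-level counters, so interpret it over a known time interval.
-- Share of XRMEM cache hits that were not satisfied by RDMA
SELECT ROUND(100 * nonrdma.value / NULLIF(nonrdma.value + rdma.value, 0), 2) AS pct_non_rdma_hits
FROM   (SELECT value FROM v$sysstat WHERE name = 'cell xrmem cache read hits') nonrdma,
       (SELECT value FROM v$sysstat WHERE name = 'cell RDMA reads') rdma;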
Parent topic: Monitoring XRMEM Cache
6.3.3 Monitoring PMEM Cache
Note:
This topic applies only to Oracle Exadata System Software releases before 23.1.0. Otherwise, see Monitoring XRMEM Cache.
Persistent Memory (PMEM) cache provides direct access to persistent memory on the storage servers using Remote Direct Memory Access (RDMA), enabling lower read latencies and faster response times. PMEM is available only in selected Exadata X8M and X9M storage server models. When database clients read from the PMEM cache, the client software performs an RDMA read of the cached data, which bypasses the storage server software and results in much lower read latencies.
PMEM cache works in conjunction with Exadata Smart Flash Cache. If available, data that is not in PMEM cache may be retrieved from Exadata Smart Flash Cache. Similarly, as data is written out of PMEM cache, it is written to Exadata Smart Flash Cache when using Write-Back mode.
Statistics from PMEM cache are slightly
different when compared with other Exadata components. Because clients issue RDMA I/O
directly to PMEM cache, the request does not go to
cellsrv
, so the storage server cannot account for the RDMA I/Os.
For this reason, there are no cell metrics for PMEM cache
I/O. Instead, Oracle Database statistics account for the I/O that is performed using
RDMA.
Performance issues related to PMEM cache
typically cause latency increases in the Oracle Database cell single block
physical read
wait events. However, bear in mind that Exadata Smart Flash Cache is still available to
service the requests, and although requests from Exadata Smart Flash Cache generally experience higher read latencies compared to PMEM cache, the requests still benefit from the fast I/O
provided by flash.
- Monitoring PMEM Cache Using AWR
- Monitoring PMEM Cache Using Database Statistics and Wait Events
- Monitoring PMEM Cache Using Exadata Metrics
- What to Look For When Monitoring PMEM Cache
Parent topic: Monitoring Oracle Exadata System Software Components
6.3.3.1 Monitoring PMEM Cache Using AWR
Note:
In Oracle Database versions that support Oracle Exadata System Software release 23.1.0 and later, all references to PMEM cache are renamed to XRMEM cache (Exadata RDMA Memory Cache). However, these parts of the AWR report still correspond to PMEM cache when the underlying Oracle Exadata Storage Server contains PMEM cache.
Automatic Workload Repository (AWR) contains information relating to PMEM cache. Following are descriptions and examples of the most commonly used sections in the AWR report that contain information about PMEM cache. By reviewing these sections of the AWR report, administrators can understand the operation of PMEM cache.
Note that because storage servers cannot account for the RDMA I/Os, the
Oracle Database statistics are critical in understanding the use of PMEM cache. The PMEM cache
statistics in the Exadata-specific sections of the AWR report only include I/Os that are
serviced by cellsrv
, which are not RDMA I/Os.
Apart from the sections described below, the AWR report also contains sections for PMEM Cache User Writes, and PMEM Cache Internal Reads. These sections are not described in detail because they relate to the use of PMEM cache in Write-Back mode, which is not generally recommended.
Database IOs
A summary of Database IOs may be found in the Single Block Reads section of the AWR report. This summary contains information about the effectiveness of PMEM cache. The Single Block Reads section is located in the AWR report under Exadata Statistics > Performance Summary.
In the following example, the overwhelming majority of all read I/O
requests (approximately 88%) are serviced using RDMA reads to PMEM cache (cell RDMA reads
), at a
rate of more than 176,000 per second. Nearly all of the rest are satisfied using
non-RDMA PMEM cache reads (cell xrmem cache
read hits
or cell pmem cache read hits
), while the
remaining reads are serviced by using Exadata Smart Flash Cache (cell flash cache read hits
). Notice the
massive difference in throughput for each different I/O type, which illustrates the
power of PMEM cache and RDMA.
Figure 6-15 AWR Report: Database IOs

PMEM Cache Configuration and Space Usage
The PMEM Cache Configuration section contains summary information including the caching mode (Write-Through or Write-Back) and overall size. The PMEM Cache Space Usage section provides summary statistics on space usage in PMEM cache.
The following example shows PMEM cache configured in Write-Through mode on all cells, with a total size of approximately 1509 GB.
Figure 6-16 AWR Report: PMEM Cache Configuration

The following example of the PMEM Cache Space Usage section shows 172 GB of PMEM cache spread evenly across 3 cells.
Figure 6-17 AWR Report: PMEM Cache Space Usage

PMEM Cache User Reads
The PMEM Cache User Reads section
shows information about read requests, read throughput, and read efficiency. Because
storage servers cannot account for RDMA I/Os, the statistics only relate to non-RDMA
reads processed by cellsrv
. Hits are for I/Os satisfied using PMEM cache, while misses indicate that the data was not
in PMEM cache.
Figure 6-18 AWR Report: PMEM Cache User Reads

PMEM Cache Internal Writes
The PMEM Cache Internal Writes section shows writes to PMEM cache that are performed by Oracle Exadata System Software. The internal writes are I/Os that populate PMEM cache.
Figure 6-19 AWR Report: PMEM Cache Internal Writes

Parent topic: Monitoring PMEM Cache
6.3.3.2 Monitoring PMEM Cache Using Database Statistics and Wait Events
Note:
-
The availability of a specific statistic or wait event is subject to the version of Oracle Database being used.
-
In Oracle Database versions that support Oracle Exadata System Software release 23.1.0 and later, all database statistics and wait events relating to PMEM cache are renamed to XRMEM cache (Exadata RDMA Memory Cache). However, these database statistics and wait events still correspond to PMEM cache when the underlying Oracle Exadata Storage Server contains PMEM cache.
The following table describes database statistics that are useful for
monitoring PMEM cache. The statistics are available in
various dynamic performance views, including V$SYSSTAT
, and may be
displayed in the Global Activity Statistics or Instance Activity Statistics section of
an AWR report.
Statistic | Description |
---|---|
cell RDMA reads |
The number of successful PMEM cache read requests using RDMA |
cell RDMA reads eligible |
The number of database reads that meet basic eligibility criteria for RDMA |
cell RDMA read hash table
probes |
The total number of RDMA hash table probes issued to determine the presence of data in PMEM cache. Each eligible read is usually associated with a hash table probe. |
cell RDMA reads issued |
The total number of RDMA reads issued to retrieve data from PMEM cache. An RDMA read is usually issued after each successful hash table probe. |
|
The total number of RDMA hash table probes that failed. Each statistic covers a specific failure reason. The sum of these failures accounts for most of the difference between
Some RDMA hash table probe failures are normal while PMEM cache is being initialized. However, frequent failures or a large number of failures may indicate a problem requiring further investigation. |
|
The total number of RDMA reads that failed. Each statistic covers a specific failure reason. The sum of these failures accounts for most of the difference between
Normally, some RDMA read failures may occur. However, frequent failures or a large number of failures may indicate a problem requiring further investigation. |
cell RDMA reads rejected -
ineligible |
The number of database reads that failed basic eligibility criteria for RDMA |
cell RDMA writes |
The number of successful PMEM cache write requests using RDMA |
|
The number of non-RDMA read requests processed by
The statistic is named |
|
The number of non-RDMA write requests processed by
The statistic is named |
The following table describes database wait events that are useful for
monitoring PMEM cache. The wait events are visible in
various dynamic performance views, including V$SESSION
,
V$SYSTEM_EVENT
and V$SESSION_EVENT
, and may be
displayed in the Wait Event sections of the AWR report.
Wait Event | Description |
---|---|
cell list of blocks physical
read |
This wait event occurs during recovery or during buffer pre-fetching (rather than performing multiple single-block reads). It is used to monitor database blocks that must be changed as part of recovery and are read in parallel for the database. In
This wait event is equivalent to |
cell list of blocks read request |
This is a placeholder wait event associated with
|
cell multiblock physical read |
This wait event represents the time taken to perform all the I/Os for a multi-block database read. In
This wait event is equivalent to |
cell multiblock read request |
This is a placeholder wait event associated with
|
cell single block physical read |
This wait event represents the time taken to perform a
single block database I/O, equivalent to Commencing with the May 2022 Oracle Database release
updates (versions 19.15.0.0.220419, 21.6.0.0.220419, and later),
this wait event no longer includes I/O from Exadata Smart Flash Cache, I/O
from PMEM cache, or database I/O using
a Remote Direct Memory Access (RDMA) read. With this change, another
wait event ( Before the May 2022 Oracle Database release updates, effective use of PMEM cache coincides with extremely low latency for this wait event, which is typical for operations using RDMA. In
|
|
This wait event represents the time taken to perform a single block database I/O from PMEM cache. Effective use of PMEM cache coincides with extremely low latency for this wait event, which is typical for operations using RDMA. This wait event was introduced in the May 2022 Oracle
Database release updates and is present in Oracle Database versions
19.15.0.0.220419, 21.6.0.0.220419, and later. Previously,
The wait event is named In
|
cell single block read request |
This is a placeholder wait event associated with a single
block database I/O that is visible only during the wait period.
After the wait event ends, the placeholder is converted to the
appropriate wait event, which is typically one of the |
cell interconnect retransmit during physical
read |
This wait event appears during retransmission for an I/O
of a single-block or multiblock read. The cell hash number in the
|
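In addition to average latencies, the wait-time distribution can reveal whether RDMA and PMEM cache reads are behaving as expected. The following sketch queries V$EVENT_HISTOGRAM for the single-block read wait events; bucket boundaries are in milliseconds, and effective RDMA access concentrates the counts in the lowest buckets.
-- Wait-time distribution for the single-block read wait event family
SELECT event,
       wait_time_milli,
       wait_count
FROM   v$event_histogram
WHERE  event LIKE 'cell single block physical read%'
ORDER BY event, wait_time_milli;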
Parent topic: Monitoring PMEM Cache
6.3.3.3 Monitoring PMEM Cache Using Exadata Metrics
Exadata metrics that are related to PMEM cache are identified in the Exadata storage server
METRICCURRENT
, METRICDEFINITION
, and
METRICHISTORY
objects as having
objectType=PMEMCACHE
.
Note that because reads from PMEM cache are primarily performed using RDMA calls from Oracle Database, there are no Exadata metrics that tally PMEM cache I/O.
Example 6-3 Displaying PMEM Cache Metric Definitions
This example shows how to display the PMEM cache metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = PMEMCACHE
PC_BY_ALLOCATED "Number of megabytes allocated in PMEM cache"
Note the following additional details:
- PC_BY_ALLOCATED represents the number of megabytes allocated in PMEM cache and tracks how many cachelines are used in the PMEM cache. If the value is close to the PMEM cache size, then the PMEM cache is fully populated. This metric is also available per database (DB_PC_BY_ALLOCATED) and per PDB (PDB_PC_BY_ALLOCATED).
Parent topic: Monitoring PMEM Cache
6.3.3.4 What to Look For When Monitoring PMEM Cache
Database Working Set
Performance is maximized if the entire database working set fits into PMEM cache. However, the PMEM cache is much smaller than Exadata Smart Flash Cache and it is likely that the database working set, or even the most frequently accessed portion of it, might not reside in PMEM cache.
If the size of the working set increases you may observe more reads
being satisfied by Exadata Smart Flash Cache or
hard disk instead of PMEM cache. This is magnified when
access patterns on the data are more random. Consequently, you may observe increased
read latencies, specifically in the cell single block physical read
wait event.
Non-RDMA Reads
PMEM cache is most effective when it is used in conjunction with RDMA.
However, in various situations requests to cellsrv
might still be
satisfied using PMEM cache. In such cases, you will see
the associated reads as cell xrmem cache read hits
or cell
pmem cache read hits
rather than cell RDMA reads
.
A non-RDMA read may occur for the following reasons:
- Based on the type of data being read, Oracle Database may not be able to perform an RDMA read. Direct reads and reads from the Oracle Database control file are examples of this. To identify the cause, you may need to correlate with other statistics or wait events. For example, significant activity against the control files is usually accompanied by the control file sequential read wait event, or is visible in the IOStat by Filetype section of the AWR report.
- In the Oracle Database client, buffers may not be registered for RDMA. This normally occurs shortly after a client process starts or while PMEM cache is populating.
- Inside PMEM cache, the RDMA hash table may not contain the required metadata (or it may be marked as invalid). This normally occurs while PMEM cache is populating.
- An RDMA read times out after exceeding the default lease time (20 ms).
- Memory limitations prevent the creation of memory structures that are required for RDMA.
- An RDMA read error occurs.
Parent topic: Monitoring PMEM Cache
6.3.4 Monitoring Exadata Smart Flash Log
Exadata Smart Flash Log reduces the average latency for performance-sensitive redo log write I/O operations, thereby eliminating performance bottlenecks that may occur due to slow redo log writes. Exadata Smart Flash Log removes latency spikes by simultaneously performing redo log writes to two media devices. The redo write is acknowledged as soon as the first write to either media device completes.
Originally, Exadata Smart Flash Log was used to perform simultaneous writes to disk and flash storage. However, Oracle Exadata System Software release 20.1 adds a further optimization, known as Smart Flash Log Write-Back, that uses Exadata Smart Flash Cache in Write-Back mode instead of disk storage.
- Monitoring Exadata Smart Flash Log Using AWR
- Monitoring Exadata Smart Flash Log Using Database Statistics and Wait Events
- Monitoring Exadata Smart Flash Log Using Exadata Metrics
- What to Look For When Monitoring Exadata Smart Flash Log
Parent topic: Monitoring Oracle Exadata System Software Components
6.3.4.1 Monitoring Exadata Smart Flash Log Using AWR
Following are descriptions and examples of the Flash Log Statistics sections in the AWR report. By reviewing these sections of the AWR report, administrators can understand the operation of Exadata Smart Flash Log.
Flash Log
The Flash Log section contains summary statistics about the operation of Exadata Smart Flash Log, including the number of writes to Exadata Smart Flash Log, the number of outliers prevented by Exadata Smart Flash Log, and the number of times that Exadata Smart Flash Log is skipped.
The following shows an example of the Flash Log section. As shown in the example, when Exadata Smart Flash Log operates optimally, I/O is evenly distributed across all of the cells and there are no skips.
Figure 6-20 AWR Report: Flash Log

If the Skip Count is greater than zero, then the Flash Log Skip Details section contains the reasons for skipping Exadata Smart Flash Log.
Redo Write Histogram
The Redo Write Histogram section provides histograms that show the
log file parallel write
latency from the database, along with
the redo write request completion
latency from the storage servers.
By comparing the database and cell latency histograms, you can determine if the high
latency outliers are related to processing bottlenecks on the storage servers.
Figure 6-21 AWR Report: Redo Write Histogram

Parent topic: Monitoring Exadata Smart Flash Log
6.3.4.2 Monitoring Exadata Smart Flash Log Using Database Statistics and Wait Events
The following table describes various database statistics that are useful for
monitoring redo write and Exadata Smart Flash Log
performance. The statistics are available in various dynamic performance views,
including V$SYSSTAT
, and may be displayed in the Global Activity
Statistics or Instance Activity Statistics section of an AWR report.
Statistic | Description |
---|---|
|
Number of redo writes where the redo write size is smaller than the
size indicated in parenthesis. For example, |
|
Total number of writes to the redo log files by the Oracle Database log writer (LGWR) process(es). |
The following table describes database wait events that are useful for
monitoring redo write and Exadata Smart Flash Log
performance. The wait events are visible in various dynamic performance views, including
V$SESSION
, V$SYSTEM_EVENT
and
V$SESSION_EVENT
, and may be displayed in the Wait Event sections of
the AWR report.
Wait Event | Description |
---|---|
log file parallel write |
The Oracle Database log writer (LGWR) process waits on this event when it is waiting for the completion of writes to the redo log file. Efficient use of Exadata Smart Flash Log is indicated by comparatively lower time waiting on this event. |
The availability of a specific statistic or wait event is subject to the version of Oracle Database being used.
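As an illustration, the redo write statistic and wait event described above can be sampled together. The following sketch joins V$SYSSTAT and V$SYSTEM_EVENT; it assumes the standard redo writes statistic name, and the counters are cumulative since instance startup.
-- Redo write count and average log file parallel write latency
SELECT s.value AS redo_writes,
       e.total_waits AS lfpw_waits,
       ROUND(e.time_waited_micro / GREATEST(e.total_waits, 1) / 1000, 2) AS avg_lfpw_ms
FROM   v$sysstat s, v$system_event e
WHERE  s.name = 'redo writes'
AND    e.event = 'log file parallel write';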
Parent topic: Monitoring Exadata Smart Flash Log
6.3.4.3 Monitoring Exadata Smart Flash Log Using Exadata Metrics
Exadata metrics that are related to Exadata Smart Flash Log provide information about flash log utilization, such as the number of
megabytes written per second. Flash Log metrics are identified in the
METRICCURRENT
, METRICDEFINITION
, and
METRICHISTORY
objects as having
objectType=FLASHLOG
.
Example 6-4 Displaying Flash Log Metric Definitions
This example shows how to display the flash log metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = FLASHLOG
FL_ACTUAL_OUTLIERS "The number of times redo writes to flash and disk both exceeded the outlier threshold"
FL_BY_KEEP "The amount of redo data saved on flash due to disk I/O errors"
FL_DISK_FIRST "The number of times redo writes first completed to disk"
FL_DISK_IO_ERRS "The number of disk I/O errors encountered by Smart Flash Logging"
FL_EFFICIENCY_PERCENTAGE "The efficiency of Smart Flash Logging expressed as a percentage"
FL_EFFICIENCY_PERCENTAGE_HOUR "The efficiency of Smart Flash Logging over the last hour expressed as a percentage"
FL_FLASH_FIRST "The number of times redo writes first completed to flash"
FL_FLASH_IO_ERRS "The number of flash I/O errors encountered by Smart Flash Logging"
FL_FLASH_ONLY_OUTLIERS "The number of times redo writes to flash exceeded the outlier threshold"
FL_IO_DB_BY_W "The number of MB written to hard disk by Smart Flash Logging"
FL_IO_DB_BY_W_SEC "The rate which is the number of MB per second written to hard disk by Smart Flash Logging"
FL_IO_FL_BY_W "The number of MB written to flash by Smart Flash Logging"
FL_IO_FL_BY_W_SEC "The rate which is the number of MB per second written to flash by Smart Flash Logging"
FL_IO_TM_W "Cumulative latency of all redo log writes"
FL_IO_TM_W_RQ "Average latency of all redo log writes"
FL_IO_W "The number of writes serviced by Smart Flash Logging"
FL_IO_W_SKIP_BUSY "The number of redo writes that could not be serviced by Smart Flash Logging because too much data had not yet been written to disk"
FL_IO_W_SKIP_BUSY_MIN "The number of redo writes during the last minute that could not be serviced by Smart Flash Logging because too much data had not yet been written to disk"
FL_IO_W_SKIP_DISABLED_GD "The number of redo writes that could not be serviced by Smart Flash Logging because it was disabled for the redo log's grid disk"
FL_IO_W_SKIP_IORM_LIMIT "The number of redo writes that could not be serviced by Smart Flash Logging because the IORM limit had been reached for the redo log's grid disk"
FL_IO_W_SKIP_IORM_PLAN "The number of redo writes that could not be serviced by Smart Flash Logging because it was disabled by the IORM plan"
FL_IO_W_SKIP_LARGE "The number of large redo writes that could not be serviced by Smart Flash Logging because the size of the data was larger than the amount of available space on any flash disk"
FL_IO_W_SKIP_LOG_ON_FLASH "The number of redo writes that could not be serviced by Smart Flash Logging because the redo log resided on flash"
FL_IO_W_SKIP_NO_BUFFER "The number of redo writes that could not be serviced by Smart Flash Logging because of lack of available buffers"
FL_IO_W_SKIP_NO_FL_DISKS "The number of redo writes that could not be serviced by Smart Flash Logging because there were no available Flash Log disks"
FL_PREVENTED_OUTLIERS "The number of times redo writes to disk exceeded the outlier threshold; these would have been outliers had it not been for Smart Flash Logging"
FL_RQ_TM_W "Cumulative latency of all redo log write requests (includes network and other processing overhead)"
FL_RQ_TM_W_RQ "Average latency of all redo log write requests"
FL_RQ_W "The number of redo log write requests serviced (includes requests which were not handled by Smart Flash Logging)"
FL_SKIP_OUTLIERS "The number of times redo writes to disk exceeded the outlier threshold when Smart Flash Logging was not used"
Note the following additional details:
- FL_RQ_TM_W tracks the cumulative redo log write request latency, which includes networking and other overhead. To determine the latency overhead due to factors such as network and processing, subtract FL_IO_TM_W from FL_RQ_TM_W.
- FL_SKIP_OUTLIERS tracks the number of outliers when a redo log write skips use of the flash log. An outlier is a redo log write that exceeds 0.5 seconds. Use of the flash log can be skipped for the reasons associated with FL_IO_W_SKIP_DISABLED_GD, FL_IO_W_SKIP_IORM_LIMIT, FL_IO_W_SKIP_IORM_PLAN, FL_IO_W_SKIP_LOG_ON_FLASH, or FL_IO_W_SKIP_NO_FL_DISKS.
- For simultaneous writes to disk and flash storage, the disk controller write cache can absorb some writes quicker than flash. Consequently, it is normal for a significant proportion of redo log write operations to complete to disk before flash. In some cases, the value of FL_DISK_FIRST can exceed the value of FL_FLASH_FIRST. However, this does not mean that the Exadata Smart Flash Logging feature is ineffective or unnecessary.
Parent topic: Monitoring Exadata Smart Flash Log
6.3.4.4 What to Look For When Monitoring Exadata Smart Flash Log
General Performance
Performance issues related to redo logging typically exhibit high latency
for the log file sync
wait event in the Oracle Database user
foreground processes, with corresponding high latency for log file parallel
write
in the Oracle Database log writer (LGWR) process. Because of the
performance-critical nature of redo log writes, occasional long latencies for
log file parallel write
may cause fluctuations in database
performance, even if the average log file parallel write
wait time
is acceptable.
If any of these are occurring, then it may be indicative of an issue with Exadata Smart Flash Log performance.
Redo Write Histograms
The log file parallel write
wait event indicates the
amount of time that the database waits on a redo log write. The log file
parallel write
histogram shows the number of times the redo write
completed within a specified time range. Similarly, the redo log write
completions
statistic indicates the amount of time that the storage
server spends processing redo write requests, and the redo log write
completions
histogram shows the number of times the redo write request
was completed within a specified time range. Both histograms are shown in the Redo
Write Histogram section of the AWR report.
A histogram with a significant number of occasional long latencies is said to have a long tail. When both of the histograms in the Redo Write Histogram section of the AWR report have long tails, then this is an indication of slow write times on the storage server, which would warrant further investigation of the other I/O performance statistics. See Monitoring Cell Disk I/O.
If the log file parallel write
histogram has a long
tail that is not present in the redo log write completions
histogram, then the cause is generally not in the storage server, but rather
something else in the I/O path, such as bottlenecks in the network or contention for
compute node CPU.
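To check for a long tail from the database side without generating an AWR report, you can inspect the wait-time distribution directly. The following is a minimal sketch against V$EVENT_HISTOGRAM; buckets are in milliseconds, and a healthy system shows very few waits in the higher buckets.
-- Distribution of log file parallel write wait times
SELECT wait_time_milli,
       wait_count
FROM   v$event_histogram
WHERE  event = 'log file parallel write'
ORDER BY wait_time_milli;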
Skipping
Increased redo write latencies can also occur when the Exadata Smart Flash Log is skipped, and the redo write goes only to disk. Both the AWR report and the storage server metrics show the number of redo log writes that skipped Exadata Smart Flash Log. Skipping may occur when Exadata Smart Flash Log contains too much data that has not yet been written to disk.
There are a few factors that can cause redo writes to skip Exadata Smart Flash Log:
-
Flash disks with high write latencies.
This can be observed in various IO Latency tables located in the Exadata Resource Statistics section of the AWR report, and in the Exadata
CELLDISK
metrics. This can also be identified by checking theFL_FLASH_ONLY_OUTLIERS
metric. If the metric value is high, this indicates a flash disk performance issue. -
Hard disks with high latencies or high utilization.
Prior to Oracle Exadata System Software release 20.1, redo log writes are written to both Exadata Smart Flash Log and hard disk. If the hard disks experience high latencies or high utilization, redo log write performance can be impacted.
This can be observed in various IO Latency tables located in the Exadata Resource Statistics section of the AWR report, and in the Exadata
CELLDISK
metrics. This can also be identified by checking the Outliers columns in the Flash Log section of the AWR report, or theFL_PREVENTED_OUTLIERS
storage server metric. A large number of prevented outliers may indicate that the hard disk writes are taking a long time.In this case, although Exadata Smart Flash Log prevents outliers, overall throughput may be limited due to the queue of redo log data that must be written to disk.
Oracle Exadata System Software release 20.1 adds a further optimization, known as Smart Flash Log Write-Back, that uses Exadata Smart Flash Cache in Write-Back mode instead of disk storage, thereby eliminating the hard disks as a potential performance bottleneck. Depending on the system workload, this feature can improve overall log write throughput by up to 250%.
Parent topic: Monitoring Exadata Smart Flash Log
6.3.5 Monitoring XRMEM Log
Note:
Oracle Exadata System Software release 23.1.0 introduces Exadata RDMA Memory (XRMEM). XRMEM represents the collection of Exadata software technologies that provide direct access to storage server memory using Remote Direct Memory Access (RDMA), enabling faster response times and lower read latencies. In this release, the persistent memory commit accelerator, previously known as PMEM log, is now called XRMEM Log.
Redo log writes are critical database operations and need to complete in a timely manner to prevent load spikes or stalls. Exadata Smart Flash Log is designed to prevent redo write latency outliers. XRMEM log helps to further reduce redo log write latency by using Persistent Memory (PMEM) and Remote Direct Memory Access (RDMA). XRMEM Log is available only in selected Exadata X8M and X9M storage server models.
With XRMEM Log, database clients send I/O buffers directly to PMEM on the
storage servers using RDMA, thereby reducing transport latency. The cell server
(cellsrv
) then writes the redo to Exadata Smart Flash Log (if enabled) and disk at a later time.
Reduced redo log write latency improves OLTP performance, resulting in higher transaction throughput. In cases where XRMEM log is bypassed, Exadata Smart Flash Log can still be used.
- Monitoring XRMEM Log Using Database Statistics and Wait Events
- What to Look For When Monitoring XRMEM Log
Parent topic: Monitoring Oracle Exadata System Software Components
6.3.5.1 Monitoring XRMEM Log Using Database Statistics and Wait Events
The following table describes database statistics that are useful for
monitoring XRMEM log performance. The statistics are available in various dynamic
performance views, including V$SYSSTAT
, and may be displayed in the
Global Activity Statistics or Instance Activity Statistics section of an AWR report.
Statistic | Description |
---|---|
|
The number of redo log write requests that used XRMEM log The statistic is named |
The following table describes database wait events that are useful for
monitoring XRMEM log performance. The wait events are visible in various dynamic
performance views, including V$SESSION
, V$SYSTEM_EVENT
and V$SESSION_EVENT
, and may be displayed in the Wait Event sections of
the AWR report.
Wait Event | Description |
---|---|
log file parallel write |
The Oracle Database log writer (LGWR) process waits on this event when it is waiting for the completion of writes to the redo log file. Efficient use of XRMEM log is indicated by comparatively lower time waiting on this event. |
The availability of a specific statistic or wait event is subject to the version of Oracle Database being used.
Parent topic: Monitoring XRMEM Log
6.3.5.2 What to Look For When Monitoring XRMEM Log
General Performance
Performance issues related to redo logging typically exhibit high latency
for the log file sync
wait event in the Oracle Database user foreground processes, with corresponding high latency for log file parallel
write
in the Oracle Database log writer (LGWR) process. Because of the
performance-critical nature of redo log writes, occasional long latencies for
log file parallel write
may cause fluctuations in database
performance, even if the average log file parallel write
wait time
is acceptable.
If any of these are occurring, then it may be indicative of an issue with XRMEM log performance.
Bypassing XRMEM Log
Increased redo write latencies when using XRMEM log can occur when XRMEM
log is bypassed. When XRMEM log is bypassed, the request is sent to the
cellsrv
, and Exadata Smart Flash Log is still used (if available). However, when the bypass request is sent,
it has to ensure that there is no conflict with a previous XRMEM log request. This
conflict checking, which requires scanning XRMEM log, makes bypass writes more
expensive to process, and can result in higher than expected redo log write
latencies.
There are several possible causes that result in a small number of XRMEM log bypasses. Under normal circumstances, the number of bypasses should be substantially less than 1% of the total number of XRMEM log requests. A high number of XRMEM log bypasses is likely to be a symptom of another problem, such as congestion on the RoCE Network Fabric.
Parent topic: Monitoring XRMEM Log
6.3.6 Monitoring PMEM Log
Note:
This topic applies only to Oracle Exadata System Software releases before 23.1.0. Otherwise, see Monitoring XRMEM Log.
Redo log writes are critical database operations and need to complete in a timely manner to prevent load spikes or stalls. Exadata Smart Flash Log is designed to prevent redo write latency outliers. PMEM Log helps to further reduce redo log write latency by using Persistent Memory (PMEM) and Remote Direct Memory Access (RDMA). PMEM Log is available only in selected Exadata X8M and X9M storage server models.
With PMEM Log, database clients send I/O buffers directly to PMEM on the storage servers
using RDMA, thereby reducing transport latency. The cell server
(cellsrv
) then writes the redo to Exadata Smart Flash Log (if enabled) and disk at a later time.
Reduced redo log write latency improves OLTP performance, resulting in higher transaction throughput. In cases where PMEM Log is bypassed, Exadata Smart Flash Log can still be used.
- Monitoring PMEM Log Using Database Statistics and Wait Events
- What to Look For When Monitoring PMEM Log
Parent topic: Monitoring Oracle Exadata System Software Components
6.3.6.1 Monitoring PMEM Log Using Database Statistics and Wait Events
The following table describes database statistics that are useful for
monitoring PMEM Log performance. The statistics are available in various dynamic
performance views, including V$SYSSTAT
, and may be displayed in the
Global Activity Statistics or Instance Activity Statistics section of an AWR report.
Statistic | Description |
---|---|
|
The number of redo log write requests that used PMEM log The statistic is named |
The following table describes database wait events that are useful for
monitoring PMEM Log performance. The wait events are visible in various dynamic
performance views, including V$SESSION
, V$SYSTEM_EVENT
and V$SESSION_EVENT
, and may be displayed in the Wait Event sections of
the AWR report.
Wait Event | Description |
---|---|
log file parallel write |
The Oracle Database log writer (LGWR) process waits on this event when it is waiting for the completion of writes to the redo log file. Efficient use of PMEM Log is indicated by comparatively lower time waiting on this event. |
The availability of a specific statistic or wait event is subject to the version of Oracle Database being used.
Parent topic: Monitoring PMEM Log
6.3.6.2 What to Look For When Monitoring PMEM Log
General Performance
Performance issues related to redo logging typically exhibit high latency
for the log file sync
wait event in the Oracle Database user foreground processes, with corresponding high latency for log file parallel
write
in the Oracle Database log writer (LGWR) process. Because of the
performance-critical nature of redo log writes, occasional long latencies for
log file parallel write
may cause fluctuations in database
performance, even if the average log file parallel write
wait time
is acceptable.
If any of these are occurring, then it may be indicative of an issue with PMEM Log performance.
Bypassing PMEM Log
Increased redo write latencies when using PMEM Log can occur when PMEM
Log is bypassed. When PMEM Log is bypassed, the request is sent to the
cellsrv
, and Exadata Smart Flash Log is still used (if available). However, when the bypass request is sent,
it has to ensure that there is no conflict with a previous PMEM Log request. This
conflict checking, which requires scanning PMEM Log, makes bypass writes more
expensive to process, and can result in higher than expected redo log write
latencies.
There are several possible causes that result in a small number of PMEM Log bypasses. Under normal circumstances, the number of bypasses should be substantially less than 1% of the total number of PMEM Log requests. A high number of PMEM Log bypasses is likely to be a symptom of another problem, such as congestion on the RoCE Network Fabric.
Parent topic: Monitoring PMEM Log
6.3.7 Monitoring Smart I/O
Exadata Smart Scan offloads data search and retrieval processing to the Exadata storage servers. It is able to evaluate database predicates on the storage server to optimize efficiency and performance, especially for large queries and certain classes of bulk data processing.
Smart Scan also uses storage indexes, which are maintained automatically on the storage servers, to further optimize filtering by eliminating unnecessary I/O.
Columnar cache is another smart I/O optimization. The columnar cache is a section of Exadata Smart Flash Cache that stores data in columnar format. The columnar cache is automatically populated and maintained by the storage server when a Smart Scan is performed.
- Monitoring Smart I/O Using AWR
- Monitoring Smart I/O Using Database Statistics and Wait Events
- Monitoring Smart I/O Using SQL Monitor
- Monitoring Smart I/O Using SQL Explain Plan
- Monitoring Smart I/O Using Exadata Metrics
- What to Look For When Monitoring Smart I/O
Related Topics
Parent topic: Monitoring Oracle Exadata System Software Components
6.3.7.1 Monitoring Smart I/O Using AWR
The Smart Scan Summary is part of the Performance Summary section of the AWR report. This area contains a high-level summary of key statistics associated with smart I/O. The following example highlights a situation where some activity reverted to block I/O because of ongoing online encryption.
Figure 6-22 AWR Report: Performance Summary - Smart Scan Summary

The Smart IO Statistics section of the AWR report contains more detailed summarized statistical information for smart I/O operations on the whole system and on each storage server. In particular, it tracks the amount of I/O that is eligible for optimization using Smart Scan, the savings from using storage indexes and columnar cache, the amount of data in Exadata Smart Flash Cache, and the amount of data filtered by the storage servers without transportation back to the database server. It also shows associated database statistics and more detailed information about situations where the storage servers are unable to perform predicate evaluation (also known as 'passthru').
Figure 6-23 AWR Report: Smart IO Statistics

Related Topics
Parent topic: Monitoring Smart I/O
6.3.7.2 Monitoring Smart I/O Using Database Statistics and Wait Events
The following table describes various database statistics that are useful for
monitoring smart I/O operations. The statistics are available in various dynamic
performance views, including V$SYSSTAT
, and may be displayed in the
Global Activity Statistics or Instance Activity Statistics section of an AWR report.
Statistic | Description |
---|---|
cell IO uncompressed bytes |
The total size of uncompressed data that is processed on the cell. For operations on segments compressed using Exadata Hybrid Columnar Compression, this statistic is the size of the data after decompression. |
cell num bytes in block IO during predicate
offload |
The number of bytes that were not offloaded because the client was in block I/O mode. |
cell num bytes in filter passthru due to low
mem |
The number of bytes that were not offloaded and sent back to the database for processing due to low memory on the cell. |
cell num bytes in filter passthru due to subheap
size limit exc |
The number of bytes that were not offloaded and sent back to the database for processing due to a memory limit on the cell. |
cell num bytes in passthru due to quarantine |
The number of bytes that were not offloaded and sent back to the database for processing due to a quarantine on the cell. |
cell num bytes in passthru during predicate
offload |
The number of bytes that were not offloaded and sent back to the database for processing. |
cell num smart IO sessions in rdbms block IO due
to big payload
|
The number of sessions in block I/O mode (not offloaded) due to excessively large metadata. |
cell num smart IO sessions in rdbms block IO due
to no cell mem |
The number of sessions in block I/O mode (not offloaded) due to memory shortage on the storage servers. |
cell num smart IO sessions in rdbms block IO due
to online encr |
The number of sessions in block I/O mode (not offloaded) due to an ongoing online encryption operation. |
|
The number of sessions in block I/O mode (not offloaded) due to a failure in opening a connection to a cell. |
|
The number of sessions in block I/O mode (not offloaded) due to a user setting. |
cell num smart IO sessions using passthru mode
due to cellsrv |
The number of sessions in passthru mode (not offloaded) due to an issue with CELLSRV. |
|
The number of sessions in passthru mode (not offloaded) due to an ongoing database timezone upgrade operation. |
cell num smart IO sessions using passthru mode
due to user |
The number of sessions in passthru mode (not offloaded) due to a user setting. |
cell physical IO bytes added to storage
index |
The number of bytes added to the storage index during a Smart Scan. This is an indication that the storage index is being built. |
cell physical IO bytes eligible for predicate
offload |
The number of bytes on-disk eligible for predicate offload. |
cell physical IO bytes eligible for smart
IOs |
The number of actual bytes eligible for predicate offload. For example, when using columnar cache, this is the size of columnar cache instead of the on-disk size. |
cell physical IO bytes processed for IM
capacity |
The number of bytes read from the columnar cache in
memcompress for capacity format.
|
cell physical IO bytes processed for IM
query |
The number of bytes read from the columnar cache in
memcompress for query format.
|
cell physical IO bytes processed for no
memcompress |
The number of bytes read from the columnar cache in no
memcompress format.
|
cell physical IO bytes saved by columnar
cache |
The number of bytes saved by columnar cache; that is, the number of bytes of reading that was avoided. |
cell physical IO bytes saved by storage
index |
The number of bytes saved by storage index. |
cell physical IO bytes saved during optimized file
creation |
The number of I/O bytes saved by the database host by offloading the file creation operation to the cells. This statistic shows the benefit of optimized file creation operations. |
cell physical IO bytes saved during optimized RMAN
restore |
The number of I/O bytes saved by the database host by offloading the RMAN file restore operation to the cells. This statistic shows the benefit of optimized RMAN file restore operations. |
cell physical IO bytes sent directly to DB node to balance
CPU usage |
The number of I/O bytes sent back to the database server for processing due to high storage server CPU usage. |
cell physical IO interconnect bytes |
The number of I/O bytes exchanged over the interconnect between the database host and the cells. |
cell physical IO interconnect bytes returned by smart
scan |
The number of I/O bytes that are returned by the cell for Smart Scan operations. It does not include bytes for other database I/O. |
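Several of these statistics can be combined to approximate how much eligible data was filtered on the storage servers instead of being shipped back to the database. The following sketch is only an approximation based on cumulative instance-level values, and it uses statistic names from the table above.
-- Approximate percentage of eligible bytes filtered on the storage servers
SELECT ROUND(100 * (1 - ret.value / NULLIF(elig.value, 0)), 2) AS pct_eligible_bytes_filtered
FROM   (SELECT value FROM v$sysstat
         WHERE name = 'cell physical IO bytes eligible for predicate offload') elig,
       (SELECT value FROM v$sysstat
         WHERE name = 'cell physical IO interconnect bytes returned by smart scan') ret;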
The following table describes database wait events that are useful for
monitoring smart I/O operations. The wait events are visible in various dynamic
performance views, including V$SESSION
, V$SYSTEM_EVENT
and V$SESSION_EVENT
, and may be displayed in the Wait Event sections of
the AWR report.
Wait Event | Description |
---|---|
cell external table smart scan |
This wait event appears when the database is waiting for an external table scan on a cell. The cell hash number in the |
cell smart file creation |
This wait event appears when the database is waiting for the completion of a file creation on a cell. The cell hash number in the |
cell smart incremental backup |
This wait event appears when the database is waiting for the completion of an incremental backup on a cell. The cell hash number in the |
cell smart index scan |
This wait event appears when the database is waiting for an index fast full scan. The cell hash number in the |
cell smart index scan: db timezone
upgrade |
The wait event appears when the cells are unable to offload because a database timezone upgrade is in progress. |
cell smart index scan: disabled by
user |
The wait event appears when the cells are unable to offload due to a user setting. |
cell smart index scan: pass
through |
The wait event appears when the cells are unable to offload the Smart Scan. |
cell smart index scan request |
This is a placeholder wait event associated with
After the wait event ends, the placeholder is typically
converted to |
cell smart restore from backup |
This wait event appears when the database is waiting for the completion of a file initialization for restore from backup on a cell. The cell hash number in the |
cell smart table scan |
This wait event appears when the database is waiting for smart scans to complete on a cell. The cell hash number in the |
cell smart table scan: db timezone
upgrade |
The wait event appears when the cells are unable to offload because a database timezone upgrade is in progress. |
cell smart table scan: disabled by
user |
The wait event appears when the cells are unable to offload due to a user setting. |
cell smart table scan: pass
through |
The wait event appears when the cells are unable to offload the Smart Scan. |
cell smart table scan request |
This is a placeholder wait event associated with
After the wait event ends, the placeholder is typically
converted to |
The availability of a specific statistic or wait event is subject to the version of Oracle Database being used.
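As a quick cross-check outside of AWR, time spent in the smart scan wait events can be summarized from V$SYSTEM_EVENT. The following sketch covers the cell smart table scan and cell smart index scan families described above, including the pass through and disabled variants.
-- Time spent in smart scan wait events (cumulative since startup)
SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1000000, 1) AS time_waited_secs
FROM   v$system_event
WHERE  event LIKE 'cell smart table scan%'
   OR  event LIKE 'cell smart index scan%'
ORDER BY time_waited_secs DESC;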
Parent topic: Monitoring Smart I/O
6.3.7.3 Monitoring Smart I/O Using SQL Monitor
In addition to dynamic performance views and wait events, Oracle Database provides SQL monitor, which enables you to monitor the execution of individual SQL statements.
The SQL monitor report includes detailed statistics for the row sources, which includes additional information for smart I/O operations. The row source statistics can be viewed in the Enterprise Manager active SQL monitor report by clicking on the binoculars in the row source. For example:
Figure 6-24 SQL Monitor Report: Smart I/O Row Source Statistics

The following table describes various row source statistics provided by SQL monitor, which are useful for monitoring smart I/O.
Statistic | Description |
---|---|
Filtered bytes |
The number of bytes returned by the cell. |
Cell passthru IO bytes |
The number of bytes that are not offloaded and sent back to the database for processing. |
Cell passthru IO bytes due to quarantine |
The number of bytes that are not offloaded and sent back to the database for processing due to a quarantine on the cell. |
Eligible bytes for smart IO |
The number of actual bytes eligible for predicate offload. For example, when using columnar cache, this is the size of columnar cache instead of the on-disk size. |
SI saved bytes |
The number of bytes saved by storage index; that is, the number of bytes that did not have to be read. |
Columnar cache saved bytes |
The number of bytes saved by columnar cache; that is, the number of bytes that did not have to be read. |
Partial flash cache and disk bytes |
The number of bytes that were read from both Exadata Smart Flash Cache and disk. |
Flash cache bytes |
The number of bytes read from Exadata Smart Flash Cache. |
IM Capacity bytes |
The number of bytes read from the columnar cache in
memcompress for capacity format.
|
IM Query bytes |
The number of bytes read from the columnar cache in
memcompress for query format.
|
No memcompress bytes |
The number of bytes read from the columnar cache in
no memcompress format.
|
Bytes added to storage index |
The number of bytes added to the storage index during a Smart Scan. This is an indication that the storage index is being built. |
cell IORM IO requests on flash |
The number of physical I/O requests to flash storage. |
cell IORM wait time on flash (us) |
The amount of time (in microseconds) IORM queued the flash request. The |
cell IORM IO requests on disk |
The number of physical I/O requests to disk storage. |
cell IORM wait time on disk (us) |
The amount of time (in microseconds) IORM queued the disk request. The |
Block IO bytes |
The number of bytes in block I/O mode. |
|
The size of the query metadata sent from the database compute node to a cell. |
Eligible bytes |
The number of bytes on-disk eligible for predicate offload. |
The availability of a specific statistic is subject to the version of Oracle Database being used.
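If the Enterprise Manager active report is not available, you can also generate a SQL monitor report directly from the database by using the DBMS_SQLTUNE.REPORT_SQL_MONITOR function. The following is a minimal sketch; the sql_id value shown is hypothetical:
SQL> SET LONG 1000000 LONGCHUNKSIZE 1000000 PAGESIZE 0 LINESIZE 250 TRIMSPOOL ON
SQL> SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR(
            sql_id => 'abcd1234xyz90',  -- hypothetical SQL_ID
            type   => 'ACTIVE') AS report
     FROM dual;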
Parent topic: Monitoring Smart I/O
6.3.7.4 Monitoring Smart I/O Using SQL Explain Plan
The SQL EXPLAIN PLAN
command displays information about smart I/O
optimizations in the SQL execution plan.
You can use the EXPLAIN PLAN
command to identify parts of a SQL query
that can be offloaded to the storage server. The database parameter
CELL_OFFLOAD_PLAN_DISPLAY
must be set to AUTO
or
ALWAYS
for the EXPLAIN PLAN
command to display the
smart I/O optimizations in the SQL execution plan.
Example 6-5 Monitoring Smart I/O Using SQL Explain Plan
This example shows how to use the EXPLAIN PLAN
command
to display the smart I/O optimizations in the SQL execution plan.
In the query plan, the TABLE ACCESS STORAGE FULL
operation indicates that the corresponding full table scan is offloaded to the
storage server. The predicate information further describes the query predicates
that are offloaded to the storage server.
SQL> ALTER SESSION SET CELL_OFFLOAD_PLAN_DISPLAY = ALWAYS;
Session altered.
SQL> EXPLAIN PLAN FOR
SELECT t.prod_id, v.exp1, t2_prod_id, t2_amount_sold
FROM sales t, v1 v
WHERE t.prod_id = v.prod_id AND t.cust_id = v.cust_id
AND t.prod_id != 45
AND v.amount_sold * v.quantity_sold > 10000;
Explained.
SQL> SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
Plan hash value: 2267424675
--------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | HASH JOIN | |
|* 2 | HASH JOIN | |
|* 3 | TABLE ACCESS STORAGE FULL| SALES |
|* 4 | TABLE ACCESS STORAGE FULL| SALES |
|* 5 | TABLE ACCESS STORAGE FULL | SALES |
--------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T"."CUST_ID"="T2"."CUST_ID" AND
"T1"."PROD_ID"="T2"."PROD_ID" AND "T1"."CUST_ID"="T2"."CUST_ID")
2 - access("T"."PROD_ID"="T1"."PROD_ID")
3 - storage("T1"."PROD_ID"<200 AND
"T1"."AMOUNT_SOLD"*"T1"."QUANTITY_SOLD">10000 AND "T1"."PROD_ID"<>45)
filter("T1"."PROD_ID"<200 AND
"T1"."AMOUNT_SOLD"*"T1"."QUANTITY_SOLD">10000 AND "T1"."PROD_ID"<>45)
4 - storage("T"."PROD_ID"<200 AND "T"."PROD_ID"<>45)
filter("T"."PROD_ID"<200 AND "T"."PROD_ID"<>45)
5 - storage("T2"."PROD_ID"<200 AND "T2"."PROD_ID"<>45)
filter("T2"."PROD_ID"<200 AND "T2"."PROD_ID"<>45)
Parent topic: Monitoring Smart I/O
6.3.7.5 Monitoring Smart I/O Using Exadata Metrics
Smart I/O metrics are identified in the METRICCURRENT
,
METRICDEFINITION
, and METRICHISTORY
objects as
having objectType=SMARTIO
.
For cumulative metrics, the metric value for a specific time period can
be determined by subtracting values from different collectionTime
periods.
Example 6-6 Displaying Smart I/O Metric Definitions
This example shows how to display the smart I/O metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = SMARTIO
SIO_IO_EL_OF "Cumulative number of megabytes eligible for smart IO offload"
SIO_IO_EL_OF_SEC "Number of megabytes per second eligible for smart IO offload"
SIO_IO_OF_RE "Cumulative number of interconnect megabytes returned by smart IO"
SIO_IO_OF_RE_SEC "Number of interconnect megabytes per second returned by smart IO"
SIO_IO_PA_TH "Cumulative number of megabytes of passthru IOs by smart IO"
SIO_IO_PA_TH_SEC "Number of megabytes per second of passthru IOs by smart IO"
SIO_IO_RD_FC "Cumulative number of megabytes read from flash cache by smart IO"
SIO_IO_RD_FC_HD "Cumulative number of megabytes read from both flash cache and hard disk by smart IO"
SIO_IO_RD_FC_HD_SEC "Number of megabytes per second read from both flash cache and hard disk by smart IO"
SIO_IO_RD_FC_SEC "Number of megabytes per second read from flash cache by smart IO"
SIO_IO_RD_HD "Cumulative number of megabytes read from hard disk by smart IO"
SIO_IO_RD_HD_SEC "Number of megabytes per second read from hard disk by smart IO"
SIO_IO_RD_RQ_FC "Cumulative number of read IO requests from flash cache by smart IO"
SIO_IO_RD_RQ_FC_HD "Cumulative number of read IO requests from both flash cache and hard disk by smart IO"
SIO_IO_RD_RQ_FC_HD_SEC "Number of read IO requests per second from both flash cache and hard disk by smart IO"
SIO_IO_RD_RQ_FC_SEC "Number of read IO requests per second from flash cache by smart IO"
SIO_IO_RD_RQ_HD "Cumulative number of read IO requests from hard disk by smart IO"
SIO_IO_RD_RQ_HD_SEC "Number of read IO requests per second from hard disk by smart IO"
SIO_IO_RV_OF "Cumulative number of megabytes sent to DB node to balance CPU by smart IO"
SIO_IO_RV_OF_SEC "Number of megabytes per second sent to DB node to balance CPU by smart IO"
SIO_IO_SI_SV "Cumulative number of megabytes saved by storage index"
SIO_IO_SI_SV_SEC "Number of megabytes per second saved by storage index"
SIO_IO_WR_FC "Cumulative number of megabytes of flash cache population writes by smart IO"
SIO_IO_WR_FC_SEC "Number of megabytes per second of flash cache population writes by smart IO"
SIO_IO_WR_HD "Cumulative number of megabytes written to hard disk by smart IO"
SIO_IO_WR_HD_SEC "Number of megabytes per second written to hard disk by smart IO"
SIO_IO_WR_RQ_FC "Cumulative number of IO requests for flash cache population writes by smart IO"
SIO_IO_WR_RQ_FC_SEC "Number of IO requests per second for flash cache population writes by smart IO"
SIO_IO_WR_RQ_HD "Cumulative number of write IO requests to hard disk by smart IO"
SIO_IO_WR_RQ_HD_SEC "Number of write IO requests per second to hard disk by smart IO"
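For example, to observe how one of these cumulative metrics grows over time on a cell, you can list its history with CellCLI and subtract values from successive collectionTime periods. The following minimal sketch uses the SIO_IO_EL_OF metric from the list above:
CellCLI> LIST METRICHISTORY WHERE name = 'SIO_IO_EL_OF' ATTRIBUTES name, metricObjectName, metricValue, collectionTime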
Columnar cache metrics are identified as having a name like 'FC_COL.*'.
Example 6-7 Displaying Columnar Cache Metric Definitions
This example shows how to display the columnar cache metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE NAME LIKE 'FC_COL.*'
FC_COL_BYKEEP_USED "Number of megabytes used for keep objects in Columnar FlashCache"
FC_COL_BY_USED "Number of megabytes used in Columnar FlashCache"
FC_COL_IO_BYKEEP_R "Number of megabytes read from Columnar FlashCache for keep objects"
FC_COL_IO_BYKEEP_R_SEC "Number of megabytes read per second from Columnar FlashCache for keep objects"
FC_COL_IO_BY_R "Number of megabytes that were read from Columnar FlashCache"
FC_COL_IO_BY_R_ELIGIBLE "Number of megabytes eligible to read from Columnar FlashCache"
FC_COL_IO_BY_R_ELIGIBLE_SEC "Number of megabytes per second eligible to read from Columnar FlashCache"
FC_COL_IO_BY_R_SEC "Number of megabytes per second that were read from Columnar FlashCache"
FC_COL_IO_BY_SAVED "Number of megabytes saved by reads from Columnar FlashCache"
FC_COL_IO_BY_SAVED_SEC "Number of megabytes saved per second by reads from Columnar FlashCache"
FC_COL_IO_BY_W_POPULATE "Number of megabytes that are population writes into Columnar FlashCache due to read miss"
FC_COL_IO_BY_W_POPULATE_SEC "Number of megabytes per second that are population writes into Columnar FlashCache due to read miss"
FC_COL_IO_RQKEEP_R "Number of requests read for keep objects from Columnar FlashCache"
FC_COL_IO_RQKEEP_R_SEC "Number of requests read per second for keep objects from Columnar FlashCache"
FC_COL_IO_RQ_R "Number of requests that were read from Columnar FlashCache"
FC_COL_IO_RQ_R_ELIGIBLE "Number of reads eligible for Columnar FlashCache"
FC_COL_IO_RQ_R_ELIGIBLE_SEC "Number of reads per second eligible for Columnar FlashCache"
FC_COL_IO_RQ_R_SEC "Number of requests per second that were read from Columnar FlashCache"
FC_COL_IO_RQ_W_POPULATE "Number of requests that are population writes into Columnar FlashCache due to read miss"
FC_COL_IO_RQ_W_POPULATE_SEC "Number of requests per second that are population writes into Columnar FlashCache due to read miss"
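For example, to display the current columnar cache metric values on a cell, you might run the following CellCLI command (a minimal sketch following the command style above):
CellCLI> LIST METRICCURRENT WHERE name LIKE 'FC_COL.*' AND metricValue != 0 ATTRIBUTES name, metricValue, collectionTime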
Parent topic: Monitoring Smart I/O
6.3.7.6 What to Look For When Monitoring Smart I/O
Smart I/O Not Performing As Expected
Smart I/O operations typically occur when a full table scan or index fast full scan
is done on a row source. If smart I/O operations do not work as expected, then users
tend to experience noticeable increases in query elapsed times. In some cases, the
database shows increased cell smart table scan
wait times. However,
when wait events such as cell multiblock physical read
or
direct path read
are present instead of cell smart
table scan
, this is an indicator that smart I/O operations are not
being performed.
The following are symptoms and reasons that explain why smart I/O operations might not perform as expected:
-
Direct reads are a prerequisite for Smart Scan; without direct reads, Smart Scan cannot occur.
The cell multiblock physical read wait event occurs when blocks are read into the buffer cache. A common reason for reading into the buffer cache, rather than using direct reads, is the size of the segment. For small segments, the optimizer favors the buffer cache over direct reads.
Also, by design, Oracle Database shared server sessions do not use direct reads. Consequently, serial queries issued by a shared server session are not eligible for Smart Scan. For parallel queries issued in a shared server session, the parallel worker processes can use direct reads (and Smart Scan), but the query blocks the shared server for its entire duration.
-
The
direct path read
wait event occurs when direct reads are performed, but predicates are not offloaded to the storage servers. This may occur when there is a resource shortage on the storage servers. For example, there might be a memory shortage caused by a large number of concurrent parallel queries on the system.
Such resource shortages are typically evident in the ExaWatcher data. In particular, you can review statistics such as Number of low memory threshold failures and Number of no memory threshold failures in cellsrvstat. ExaWatcher also includes the cellmem collection, which shows how memory is consumed on the storage servers and is visible in the ExaWatcher charts produced using GetExaWatcherResults.sh.
To address this issue, you can review the use of parallel query and potentially reduce the number of active parallel query servers.
-
Predicate offload is not possible when there is uncommitted data. This typically becomes an issue if large batch operations are modifying the data, and you attempt to run a large query against a large amount of uncommitted data.
When Smart Scan encounters uncommitted data, predicate filtering cannot be offloaded to the storage servers, and additional data must be transported back to the database servers, which appears as increased bytes returned by the Smart Scan. Extra processing is also required on the database server to construct a read-consistent copy of the data, which is manifested in the following ways:
- In the best-case scenario, additional buffer gets or session logical reads are required to construct a read-consistent copy of the data.
- If the undo buffers reside in another database instance,
are required to construct a read-consistent copy of the data. - If the undo buffers reside in another database instance,
then Oracle RAC-related wait events may also be observed. The Oracle
RAC-related wait events are prefixed by
gc
. - If the undo blocks do not reside in the buffer cache of any
database instance, then additional
cell single block physical read
waits are observed in conjunction with the single block I/O that is required for read-consistency. The additional I/O can significantly impact the performance of the operation.
-
When the storage server CPU utilization is high, the storage server sends data back to the database for processing, rather than consuming even more storage server CPU to perform predicate evaluation. This is known as 'reverse offload'.
When this occurs, it is evident in the Reverse Offload column in the Smart IO section of the AWR report, and in the database statistic cell physical IO bytes sent directly to DB node to balance CPU usage.
The high CPU utilization on the storage servers may be due to the type of predicates being offloaded. For example, case-insensitive searches or the use of REGEXP_LIKE utilize more CPU than simpler predicates.
Increased storage server CPU and I/O load may also stem from SQL execution plan changes in the database, in which case reviewing the execution plans and tuning the affected SQL statements may help to resolve the issue.
-
When the storage server is unable to perform predicate evaluation, it sends the data back to the database for processing. This is also known as passthrough (or 'passthru'). The following all indicate that passthrough is occurring:
- A large value in the Passthru column in the Smart IO section of the AWR report, when compared to eligible bytes.
- A large value in the database statistic cell num bytes in passthru during predicate offload, when compared to cell physical IO bytes eligible for smart IO.
- A large Cell passthru IO bytes value in the SQL monitor row source statistic, compared to the Eligible bytes for Smart IO value.
Possible causes include:
- Quarantines — To confirm, review the database statistic cell num bytes in passthru due to quarantine, or the SQL monitor row source statistic Cell passthru IO bytes due to quarantine.
- Database timezone upgrade — Smart Scan is disabled while a database timezone upgrade is ongoing. Review the database statistic cell num smart IO sessions using passthru mode due to timezone from the Global Activity Statistics or Instance Activity Statistics section, or the Passthru Reasons in the Smart IO section of the AWR report. Depending on the database release, you may also observe the cell smart table scan: db timezone upgrade or cell smart index scan: db timezone upgrade wait event.
- User setting — A user or application may set cell_offload_processing=false, which disables Smart Scan. To confirm, review the database statistic cell num smart IO sessions using passthru mode due to user from the Global Activity Statistics or Instance Activity Statistics section, or the Passthru Reasons in the Smart IO section of the AWR report. Depending on the database release, you may also observe the cell smart table scan: disabled by user or cell smart index scan: disabled by user wait event.
- Operation cannot be offloaded — There are other reasons why the storage servers may be unable to perform predicate offload. Instances of this are visible in the database statistic cell num smart IO sessions using passthru mode due to cellsrv, or in the cell smart table scan: pass through or cell smart index scan: pass through wait events. The following section describes the reasons in detail.
-
In some situations, Oracle Database reverts to block I/O mode for an operation typically performed using smart I/O. When this occurs, it is evident in the database statistic cell num bytes in block IO during predicate offload.
There are additional related statistics that provide insight into the underlying cause. These statistics have names beginning with cell num smart IO sessions in rdbms block IO due to. For example, cell num smart IO sessions in rdbms block IO due to online encr counts sessions that revert to block I/O mode because an ongoing online encryption operation prevents the use of smart I/O. A query sketch that retrieves the key passthrough and block I/O statistics follows this list.
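To quantify the passthrough and block I/O symptoms outside of the AWR report, you can read the relevant cumulative statistics directly from V$SYSSTAT. The following is a minimal illustrative sketch using the statistic names discussed above:
SQL> SELECT name, value
     FROM v$sysstat
     WHERE name IN ('cell physical IO bytes eligible for smart IO',
                    'cell num bytes in passthru during predicate offload',
                    'cell num bytes in block IO during predicate offload',
                    'cell physical IO bytes sent directly to DB node to balance CPU usage');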
Operation Not Being Offloaded
A smart I/O operation cannot be offloaded to the Exadata storage servers in the following cases:
- A scan is performed on a clustered table
- A scan is performed on an index-organized table
- A fast full scan is performed on reverse key indexes
- The table has row dependencies enabled or the
rowscn
is being fetched - The optimizer wants the scan to return rows in
ROWID
order - The command
CREATE INDEX
usingnosort
- A
LONG
column is being selected or queried. - The query contains a compressed or out-of-line
LOB
. An out-of-lineLOB
storesLOB
data apart from the other row data and is typically larger than 4 KB in size. - A
SELECT ... VERSIONS
query is done on a table - A query that has more than 255 columns referenced, and the heap table is uncompressed, or Basic or OLTP compressed. However, such queries on tables compressed using Exadata Hybrid Columnar Compression are offloaded.
- The tablespace is encrypted, and the
CELL_OFFLOAD_DECRYPTION
parameter is set to FALSE. In order for the Oracle Exadata System Software to perform decryption, Oracle Database needs to send the decryption key to the storage server. This feature is typically disabled if there are security concerns about keys being shipped across the network to the storage servers.
- The predicate evaluation is on a virtual column.
- Although offloading is supported for most SQL operators and functions, Oracle Exadata System Software does not support
offloading for some SQL operators and functions. The dynamic performance view
V$SQLFN_METADATA
includes information about whether offloading is supported for a SQL operator or function. If the OFFLOADABLE column contains YES, then offloading is supported for the corresponding operator or function. NO indicates that offloading is not supported for the corresponding operator or function (see the query sketch following this list).
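For example, the following query (an illustrative sketch) lists the SQL functions that cannot be offloaded to the storage servers:
SQL> SELECT name, offloadable
     FROM v$sqlfn_metadata
     WHERE offloadable = 'NO'
     ORDER BY name;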
Storage Index Not Performing as Expected
The storage index resides in storage server memory. Before the persistent storage
index feature (introduced in Oracle Exadata System Software
release 21.2.0), the storage index is lost whenever
cellsrv
stops and must be rebuilt after
cellsrv
starts. Consequently, on systems not enabled with the
persistent storage index feature, the storage index is not immediately available
every time cellsrv
starts.
On systems enabled with the persistent storage index feature, the storage index
resides in shared memory, which is maintained across cellsrv
restarts. Furthermore, storage index data is automatically saved to persistent
storage during a graceful server shutdown. During server restart, the storage index
is automatically restored but is unavailable during the restoration process.
Finally, the storage index is lost and must be completely rebuilt if the storage
server suffers from an ungraceful shutdown (such as a power outage or kernel
panic).
For segments in an unencrypted tablespace, the storage index is maintained when DML occurs. However, for segments in an encrypted tablespace, DML invalidates the portion of the storage index associated with each changed data chunk (1 MB). The invalidated chunks are rebuilt during the next scan of the segment. However, the overall efficiency of the storage index is not optimal while portions are invalid.
To monitor storage index performance, monitor:
-
The
Storage Index
column in the Smart IO section of the AWR report. -
The database statistic
cell physical IO bytes saved by storage index
. -
The SQL monitor row source statistic
SI saved bytes
.
You may see reduced storage index savings if the storage index is not
yet built (or rebuilt). While the storage index is building, you will see increases
in the database statistic cell physical IO bytes added to storage
index
and the SQL monitor row source statistic Bytes added to
storage index
.
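To check these values for the current instance, you can query V$SYSSTAT directly (a minimal illustrative sketch):
SQL> SELECT name, value
     FROM v$sysstat
     WHERE name IN ('cell physical IO bytes saved by storage index',
                    'cell physical IO bytes added to storage index');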
Columnar Cache Not Performing as Expected
Like storage index, if the columnar cache is not yet built (or rebuilt),
you may see reduced savings associated with columnar cache. To monitor columnar
cache performance, monitor the Columnar Cache sections of the AWR report, or the
database statistic cell physical IO bytes saved by columnar cache
,
or the SQL monitor row source statistic Columnar cache saved
bytes
.
Similar to storage index, the columnar cache is rebuilt every time
cellsrv
starts. Consequently, the columnar cache cannot benefit
operations immediately after cellsrv
starts.
The columnar cache is automatically populated and maintained by the storage server
when a Smart Scan is performed. For Smart Scan operations on uncompressed segments,
segments compressed using OLTP compression, and segments compressed using Exadata Hybrid Columnar Compression, data is automatically
converted into the columnar cache format (no memcompress
) as part
of the Smart Scan.
However, if you are using Oracle Database In-Memory, data is rewritten
into Oracle Database In-Memory columnar format (memcompress for
query
or memcompress for capacity
) by using a
background process. Consequently, operations on that data do not benefit from the
Oracle Database In-Memory optimizations until the cache is repopulated. Information
about the population jobs is available in the Columnar Cache Population sections of
the AWR report.
If you are reading less from the columnar cache, then it will be evident in lower values of the database statistics: cell physical IO bytes processed
for IM Query
, cell physical IO bytes processed for IM
Capacity
, or cell physical IO bytes processed for no
memcompress
. The equivalent SQL monitor row source statistics are:
IM Query bytes
, IM Capacity bytes
, and
No memcompress bytes
.
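To check these values for the current instance, a simple V$SYSSTAT query such as the following illustrative sketch can be used:
SQL> SELECT name, value
     FROM v$sysstat
     WHERE name = 'cell physical IO bytes saved by columnar cache'
        OR name LIKE 'cell physical IO bytes processed for%';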
Parent topic: Monitoring Smart I/O
6.3.8 Monitoring I/O Resource Management (IORM)
Exadata I/O Resource Management (IORM) enables multiple databases and pluggable databases (PDBs) to share the same storage while ensuring that I/O resources are allocated appropriately across the various databases that share the system. This prevents one database from utilizing the entire I/O bandwidth and degrading the performance of the other databases. IORM also works in conjunction with Oracle Database Resource Manager to manage I/O resources across multiple consumer groups within a single database.
- Monitoring I/O Resource Management (IORM) Using AWR
- Monitoring I/O Resource Management (IORM) Using Database Statistics
- Monitoring I/O Resource Management (IORM) Using Exadata Metrics
- What to Look For When Monitoring I/O Resource Management (IORM)
Parent topic: Monitoring Oracle Exadata System Software Components
6.3.8.1 Monitoring I/O Resource Management (IORM) Using AWR
The AWR report includes information about the top databases running on the Exadata system. By using this information, you can monitor system-wide I/O resource consumption regardless of the database that hosts the AWR report. Additionally, the AWR report includes per-cell statistics, which enables you to easily determine if there is any skew in the I/O resource usage across the cells.
The AWR report includes:
-
Top Databases by IO Requests — provides statistics for the databases with the greatest number of I/O requests across all cells. It also breaks down the requests by device type (flash or disk) and by size (small or large).
-
Top Databases by Requests - Details — provides additional statistics regarding the I/O requests. Specifically, it shows the average latency and the average queue time for I/O requests. The queue time is the amount of time that an I/O request spends in the relevant I/O queue. Large queue times indicate that IORM is throttling the I/O.
-
Top Databases by IO Requests per Cell — provides statistics for the databases with the greatest number of I/O requests on each cell. It also breaks down the requests by device type (flash or disk) and by size (small or large). By using this information, you can easily determine if any cell behaves differently when servicing the I/O requests from the database(s).
-
Top Databases by IO Requests per Cell - Details — provides additional statistics regarding the I/O requests on each cell. Specifically, it shows the average latency and the average queue time for I/O requests on each cell. By using this information, you can easily determine if IORM behaves differently on any cell.
-
Top Databases by IO Throughput — provides statistics for the databases with the greatest I/O throughput across all cells. It also breaks down the throughput by device type (flash or disk) and by request size (small or large).
-
Top Databases by Throughput per Cell — provides statistics for the databases with the greatest I/O throughput on each cell. It also breaks down the throughput by device type (flash or disk) and by request size (small or large). By using this information, you can easily determine if any cell behaves differently when servicing the I/O from the database(s).
The AWR report may not show information about all of the databases running on the Exadata system. Rather, it is designed to focus on the databases that are responsible for the majority of I/O on the storage servers. If more detailed information is required, then IORM cell metrics should be used.
For multitenant container databases (CDBs), the statistics in the AWR report include all of the I/Os associated with the database, including all of the pluggable databases (PDBs).
The following example shows the Top Databases by IO Requests and Top Databases by Requests - Details sections in the AWR report. The Top Databases by IO Requests section shows that the DB1 database generates 46% of the I/O requests captured in AWR. It also shows that approximately 10% of the I/O requests associated with the DB1 database are I/O requests to disk devices (51,854,884 disk I/O requests of the 564,616,086 total I/O requests). The Top Databases by Requests - Details section mostly shows minimal IORM wait time (queue time < 1 millisecond) across all databases. The exception is large I/O requests to flash for the DB1 database, which each have an average queue time of 9.12 milliseconds.
Figure 6-25 AWR Report: Top Databases by IO Requests

Parent topic: Monitoring I/O Resource Management (IORM)
6.3.8.2 Monitoring I/O Resource Management (IORM) Using Database Statistics
The following table describes database statistics that are useful for
monitoring I/O Resource Management (IORM). The statistics
are available in various dynamic performance views, including
V$SYSSTAT
, and may be displayed in the Global Activity Statistics or
Instance Activity Statistics section of an AWR report.
Statistic | Description |
---|---|
Session total flash IO requests |
Total number of physical I/O requests on flash. |
Session IORM flash wait time |
Total amount of IORM wait time for flash I/O requests (in microseconds). To determine the average IORM wait time for flash I/O requests, divide Session IORM flash wait time by Session total flash IO requests. |
PDB total disk IO requests |
Total number of physical I/O requests on disk. |
PDB IORM disk wait time |
Total amount of IORM wait time for disk I/O requests (in microseconds). To determine the average IORM wait time for disk I/O requests, divide PDB IORM disk wait time by PDB total disk IO requests. |
The availability of a specific statistic is subject to the version of Oracle Database being used.
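For example, the following query (an illustrative sketch, assuming the statistic names exactly as shown in the preceding table) computes the average IORM wait time per flash I/O request:
SQL> SELECT ROUND(
            MAX(CASE WHEN name = 'Session IORM flash wait time' THEN value END) /
            NULLIF(MAX(CASE WHEN name = 'Session total flash IO requests' THEN value END), 0),
            2) AS avg_iorm_flash_wait_us
     FROM v$sysstat
     WHERE name IN ('Session IORM flash wait time', 'Session total flash IO requests');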
Parent topic: Monitoring I/O Resource Management (IORM)
6.3.8.3 Monitoring I/O Resource Management (IORM) Using Exadata Metrics
You can monitor I/O Resource Management (IORM) by using Oracle Exadata System Software metrics.
IORM uses the database unique name, not the database identifier, to collect statistics and display output. Starting with Oracle Exadata System Software release 19.1.0, if you configured ASM-scoped security for the Oracle Automatic Storage Management (Oracle ASM) cluster used by the database, then the database name is prefixed with the Oracle ASM cluster name.
- Monitoring IORM with Database Metrics
Database metrics provide information about the I/O load from each database listed in the IORM interdatabase plan. - Monitoring IORM with PDB Metrics
Pluggable Database (PDB) metrics provide information about the I/O load from each PDB hosted by a container database (CDB) listed in the IORM interdatabase plan. - Monitoring IORM with Consumer Group Metrics
Consumer group metrics provide information about the I/O load from each consumer group specified in a database resource plan. - Monitoring IORM with Category Metrics
Category metrics provide information about the I/O load from each category specified in the current IORM category plan. - Monitoring IORM Utilization
You can use metrics to monitor IORM utilization.
Parent topic: Monitoring I/O Resource Management (IORM)
6.3.8.3.1 Monitoring IORM with Database Metrics
Database metrics provide information about the I/O load from each database listed in the IORM interdatabase plan.
Database metrics are identified in the METRICCURRENT
,
METRICDEFINITION
, and METRICHISTORY
objects as
having objectType=IORM_DATABASE
.
Example 6-8 Displaying Database Metric Definitions
This example shows how to display the database metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = IORM_DATABASE
DB_FC_BY_ALLOCATED "Number of megabytes allocated in flash cache for this database"
DB_FC_IO_BY_SEC "Number of megabytes of I/O per second for this database to flash cache"
DB_FC_IO_RQ "Number of IO requests issued by a database to flash cache"
DB_FC_IO_RQ_LG "Number of large IO requests issued by a database to flash cache"
DB_FC_IO_RQ_LG_SEC "Number of large IO requests issued by a database to flash cache per second"
DB_FC_IO_RQ_SEC "Number of IO requests issued by a database to flash cache per second"
DB_FC_IO_RQ_SM "Number of small IO requests issued by a database to flash cache"
DB_FC_IO_RQ_SM_SEC "Number of small IO requests issued by a database to flash cache per second"
DB_FD_IO_BY_SEC "Number of megabytes of I/O per second for this database to flash disks"
DB_FD_IO_LOAD "Average I/O load from this database for flash disks"
DB_FD_IO_RQ_LG "Number of large IO requests issued by a database to flash disks"
DB_FD_IO_RQ_LG_SEC "Number of large IO requests issued by a database to flash disks per second"
DB_FD_IO_RQ_SM "Number of small IO requests issued by a database to flash disks"
DB_FD_IO_RQ_SM_SEC "Number of small IO requests issued by a database to flash disks per second"
DB_FD_IO_TM "The cumulative latency of reading or writing blocks by a database from flash disks"
DB_FD_IO_TM_LG "The cumulative latency of reading or writing large blocks by a database from flash disks"
DB_FD_IO_TM_LG_RQ "The rate which is the average latency of reading or writing large blocks per request by a database from flash disks"
DB_FD_IO_TM_RQ "The rate which is the average latency of reading or writing blocks per request by a database from flash disks"
DB_FD_IO_TM_SM "The cumulative latency of reading or writing small blocks by a database from flash disks"
DB_FD_IO_TM_SM_RQ "The rate which is the average latency of reading or writing small blocks per request by a database from flash disks"
DB_FD_IO_UTIL "Percentage of flash resources utilized by this database"
DB_FD_IO_UTIL_LG "Percentage of flash resources utilized by large requests from this database"
DB_FD_IO_UTIL_SM "Percentage of flash resources utilized by small requests from this database"
DB_FD_IO_WT_LG "IORM wait time for large IO requests issued to flash disks by a database"
DB_FD_IO_WT_LG_RQ "Average IORM wait time per request for large IO requests issued to flash disks by a database"
DB_FD_IO_WT_SM "IORM wait time for small IO requests issued to flash disks by a database"
DB_FD_IO_WT_SM_RQ "Average IORM wait time per request for small IO requests issued to flash disks by a database"
DB_FL_IO_BY "The number of MB written to the Flash Log"
DB_FL_IO_BY_SEC "The number of MB written per second to the Flash Log"
DB_FL_IO_RQ "The number of I/O requests issued to the Flash Log"
DB_FL_IO_RQ_SEC "The number of I/O requests per second issued to the Flash Log"
DB_IO_BY_SEC "Number of megabytes of I/O per second for this database to hard disks"
DB_IO_LOAD "Average I/O load from this database for hard disks"
DB_IO_RQ_LG "Number of large IO requests issued by a database to hard disks"
DB_IO_RQ_LG_SEC "Number of large IO requests issued by a database to hard disks per second"
DB_IO_RQ_SM "Number of small IO requests issued by a database to hard disks"
DB_IO_RQ_SM_SEC "Number of small IO requests issued by a database to hard disks per second"
DB_IO_TM_LG "The cumulative latency of reading or writing large blocks by a database from hard disks"
DB_IO_TM_LG_RQ "The rate which is the average latency of reading or writing large blocks per request by a database from hard disks"
DB_IO_TM_SM "The cumulative latency of reading or writing small blocks by a database from hard disks"
DB_IO_TM_SM_RQ "The rate which is the average latency of reading or writing small blocks per request by a database from hard disks"
DB_IO_UTIL_LG "Percentage of disk resources utilized by large requests from this database"
DB_IO_UTIL_SM "Percentage of disk resources utilized by small requests from this database"
DB_IO_WT_LG "IORM wait time for large IO requests issued to hard disks by a database"
DB_IO_WT_LG_RQ "Average IORM wait time per request for large IO requests issued to hard disks by a database"
DB_IO_WT_SM "IORM wait time for small IO requests issued to hard disks by a database"
DB_IO_WT_SM_RQ "Average IORM wait time per request for small IO requests issued to hard disks by a database"
DB_XRM_BY_ALLOCATED "Number of megabytes allocated in XRMEM cache for this database"
Note the following additional details:
-
The database for the metric is specified by the
metricObjectName
attribute of theMETRICCURRENT
andMETRICHISTORY
objects. -
For metrics that relate to I/O load (for example,
DB_FD_IO_LOAD
andDB_IO_LOAD
), see the additional information relating toCD_IO_LOAD
. -
Starting with Oracle Exadata System Software release 19.1.0, if you configured ASM-scoped security for the Oracle Automatic Storage Management (Oracle ASM) cluster used by the database, then the database name is prefixed with the Oracle ASM cluster name.
-
For cumulative metrics, the metric value for a specific time period can be determined by subtracting values from different
collectionTime
periods. -
For rate metrics, the time period for the metric value is over the previous minute.
-
In the metric descriptions, small I/O requests are less than or equal to 128 KB, and large I/O requests are larger than 128 KB.
-
All database cumulative metrics are reset to zero whenever a category, IORM, or any database resource plan is modified.
-
To list the database metric history for an inter-database plan, use the following CellCLI command:
CellCLI> LIST METRICHISTORY WHERE objectType = 'IORM_DATABASE' AND metricValue != 0 ATTRIBUTES name, metricObjectName, metricValue, collectionTime
- For multitenant container databases (CDBs), the database metric
observations include all of the I/Os associated with the database, including all
of the associated pluggable databases (PDBs). For example, the value for
DB_FC_IO_BY_SEC
includes the sum of thePDB_FC_IO_BY_SEC
values for all of the PDBs hosted by the CDB. -
Observations for Oracle ASM and all other databases not listed in the interdatabase plan are grouped together using
_OTHER_DATABASE_
as themetricObjectName
value.
6.3.8.3.2 Monitoring IORM with PDB Metrics
Pluggable Database (PDB) metrics provide information about the I/O load from each PDB hosted by a container database (CDB) listed in the IORM interdatabase plan.
PDB metrics are identified in the
METRICCURRENT
, METRICDEFINITION
, and
METRICHISTORY
objects as having
objectType=IORM_PLUGGABLE_DATABASE
.
Example 6-9 Displaying PDB Metric Definitions
This example shows how to display the PDB metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = IORM_PLUGGABLE_DATABASE
PDB_FC_BY_ALLOCATED "Number of megabytes allocated in flash cache for this pluggable database"
PDB_FC_IO_BY_SEC "Number of megabytes of I/O per second for this pluggable database to flash cache"
PDB_FC_IO_RQ "Number of IO requests issued by this pluggable database to flash cache"
PDB_FC_IO_RQ_LG "Number of large IO requests issued by this pluggable database to flash cache"
PDB_FC_IO_RQ_LG_SEC "Number of large IO requests issued by this pluggable database to flash cache per second"
PDB_FC_IO_RQ_SEC "Number of IO requests issued by this pluggable database to flash cache per second"
PDB_FC_IO_RQ_SM "Number of small IO requests issued by this pluggable database to flash cache"
PDB_FC_IO_RQ_SM_SEC "Number of small IO requests issued by this pluggable database to flash cache per second"
PDB_FD_IO_BY_SEC "Number of megabytes of I/O per second for this pluggable database to flash disks"
PDB_FD_IO_LOAD "Average I/O load from this pluggable database for flash disks"
PDB_FD_IO_RQ_LG "Number of large IO requests issued by this pluggable database to flash disks"
PDB_FD_IO_RQ_LG_SEC "Number of large IO requests issued by this pluggable database to flash disks per second"
PDB_FD_IO_RQ_SM "Number of small IO requests issued by this pluggable database to flash disks"
PDB_FD_IO_RQ_SM_SEC "Number of small IO requests issued by this pluggable database to flash disks per second"
PDB_FD_IO_TM "The cumulative latency of reading or writing blocks by this pluggable database from flash disks"
PDB_FD_IO_TM_LG "The cumulative latency of reading or writing large blocks by this pluggable database from flash disks"
PDB_FD_IO_TM_LG_RQ "The rate which is the average latency of reading or writing large blocks per request by this pluggable database from flash disks"
PDB_FD_IO_TM_RQ "The rate which is the average latency of reading or writing blocks per request by this pluggable database from flash disks"
PDB_FD_IO_TM_SM "The cumulative latency of reading or writing small blocks by this pluggable database from flash disks"
PDB_FD_IO_TM_SM_RQ "The rate which is the average latency of reading or writing small blocks per request by this pluggable database from flash disks"
PDB_FD_IO_UTIL "Percentage of flash resources utilized by this pluggable database"
PDB_FD_IO_UTIL_LG "Percentage of flash resources utilized by large requests from this pluggable database"
PDB_FD_IO_UTIL_SM "Percentage of flash resources utilized by small requests from this pluggable database"
PDB_FD_IO_WT_LG "IORM wait time for large IO requests issued to flash disks by this pluggable database"
PDB_FD_IO_WT_LG_RQ "Average IORM wait time per request for large IO requests issued to flash disks by this pluggable database"
PDB_FD_IO_WT_SM "IORM wait time for small IO requests issued to flash disks by this pluggable database"
PDB_FD_IO_WT_SM_RQ "Average IORM wait time per request for small IO requests issued to flash disks by this pluggable database"
PDB_IO_BY_SEC "Number of megabytes of I/O per second for this pluggable database to hard disks"
PDB_IO_LOAD "Average I/O load from this pluggable database for hard disks"
PDB_IO_RQ_LG "Number of large IO requests issued by this pluggable database to hard disks"
PDB_IO_RQ_LG_SEC "Number of large IO requests issued by this pluggable database to hard disks per second"
PDB_IO_RQ_SM "Number of small IO requests issued by this pluggable database to hard disks"
PDB_IO_RQ_SM_SEC "Number of small IO requests issued by this pluggable database to hard disks per second"
PDB_IO_TM_LG "The cumulative latency of reading or writing large blocks by this pluggable database from hard disks"
PDB_IO_TM_LG_RQ "The rate which is the average latency of reading or writing large blocks per request by this pluggable database from hard disks"
PDB_IO_TM_SM "The cumulative latency of reading or writing small blocks by this pluggable database from hard disks"
PDB_IO_TM_SM_RQ "The rate which is the average latency of reading or writing small blocks per request by this pluggable database from hard disks"
PDB_IO_UTIL_LG "Percentage of disk resources utilized by large requests from this pluggable database"
PDB_IO_UTIL_SM "Percentage of disk resources utilized by small requests from this pluggable database"
PDB_IO_WT_LG "IORM wait time for large IO requests issued to hard disks by this pluggable database"
PDB_IO_WT_LG_RQ "Average IORM wait time per request for large IO requests issued to hard disks by this pluggable database"
PDB_IO_WT_SM "IORM wait time for small IO requests issued to hard disks by this pluggable database"
PDB_IO_WT_SM_RQ "Average IORM wait time per request for small IO requests issued to hard disks by this pluggable database"
PDB_XRM_BY_ALLOCATED "Number of megabytes allocated in XRMEM cache for this pluggable database"
Note the following additional details:
-
The PDB for the metric is specified by the
metricObjectName
attribute of theMETRICCURRENT
andMETRICHISTORY
objects. The PDB name is a concatenation of the CDB name with the PDB name. -
For metrics that relate to I/O load (for example,
PDB_FD_IO_LOAD
andPDB_IO_LOAD
), see the additional information relating toCD_IO_LOAD
. -
Starting with Oracle Exadata System Software release 19.1.0, if you configured ASM-scoped security for the Oracle Automatic Storage Management (Oracle ASM) cluster used by the database, then the database name is prefixed with the Oracle ASM cluster name.
-
For cumulative metrics, the metric value for a specific time period can be determined by subtracting values from different
collectionTime
periods. -
For rate metrics, the time period for the metric value is over the previous minute.
-
In the metric descriptions, small I/O requests are less than or equal to 128 KB, and large I/O requests are larger than 128 KB.
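For example, to list the current nonzero PDB metrics on a cell, you can use a CellCLI command analogous to the database and consumer group examples in this chapter (a minimal sketch):
CellCLI> LIST METRICCURRENT WHERE objectType = 'IORM_PLUGGABLE_DATABASE' AND metricValue != 0 ATTRIBUTES name, metricObjectName, metricValue, collectionTime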
6.3.8.3.3 Monitoring IORM with Consumer Group Metrics
Consumer group metrics provide information about the I/O load from each consumer group specified in a database resource plan.
Consumer group metrics are identified in the
METRICCURRENT
,
METRICDEFINITION
, and
METRICHISTORY
objects as having
objectType=IORM_CONSUMER_GROUP
.
Example 6-10 Displaying Consumer Group Metric Definitions
This example shows how to display the consumer group metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = IORM_CONSUMER_GROUP
CG_FC_IO_BY_SEC "Number of megabytes of I/O per second for this consumer group to flash cache"
CG_FC_IO_RQ "Number of IO requests issued by a consumer group to flash cache"
CG_FC_IO_RQ_LG "Number of large IO requests issued by a consumer group to flash cache"
CG_FC_IO_RQ_LG_SEC "Number of large IO requests issued by a consumer group to flash cache per second"
CG_FC_IO_RQ_SEC "Number of IO requests issued by a consumer group to flash cache per second"
CG_FC_IO_RQ_SM "Number of small IO requests issued by a consumer group to flash cache"
CG_FC_IO_RQ_SM_SEC "Number of small IO requests issued by a consumer group to flash cache per second"
CG_FD_IO_BY_SEC "Number of megabytes of I/O per second for this consumer group to flash disks"
CG_FD_IO_LOAD "Average I/O load from this consumer group for flash disks"
CG_FD_IO_RQ_LG "Number of large IO requests issued by a consumer group to flash disks"
CG_FD_IO_RQ_LG_SEC "Number of large IO requests issued by a consumer group to flash disks per second"
CG_FD_IO_RQ_SM "Number of small IO requests issued by a consumer group to flash disks"
CG_FD_IO_RQ_SM_SEC "Number of small IO requests issued by a consumer group to flash disks per second"
CG_FD_IO_TM "The cumulative latency of reading or writing blocks by a consumer group from flash disks"
CG_FD_IO_TM_LG "The cumulative latency of reading or writing large blocks by a consumer group from flash disks"
CG_FD_IO_TM_LG_RQ "The rate which is the average latency of reading or writing large blocks per request by a consumer group from flash disks"
CG_FD_IO_TM_RQ "The rate which is the average latency of reading or writing blocks per request by a consumer group from flash disks"
CG_FD_IO_TM_SM "The cumulative latency of reading or writing small blocks by a consumer group from flash disks"
CG_FD_IO_TM_SM_RQ "The rate which is the average latency of reading or writing small blocks per request by a consumer group from flash disks"
CG_FD_IO_UTIL "Percentage of flash resources utilized by this consumer group"
CG_FD_IO_UTIL_LG "Percentage of flash resources utilized by large requests from this consumer group"
CG_FD_IO_UTIL_SM "Percentage of flash resources utilized by small requests from this consumer group"
CG_FD_IO_WT_LG "IORM wait time for large IO requests issued to flash disks by a consumer group"
CG_FD_IO_WT_LG_RQ "Average IORM wait time per request for large IO requests issued to flash disks by a consumer group"
CG_FD_IO_WT_SM "IORM wait time for small IO requests issued to flash disks by a consumer group"
CG_FD_IO_WT_SM_RQ "Average IORM wait time per request for small IO requests issued to flash disks by a consumer group"
CG_IO_BY_SEC "Number of megabytes of I/O per second for this consumer group to hard disks"
CG_IO_LOAD "Average I/O load from this consumer group for hard disks"
CG_IO_RQ_LG "Number of large IO requests issued by a consumer group to hard disks"
CG_IO_RQ_LG_SEC "Number of large IO requests issued by a consumer group to hard disks per second"
CG_IO_RQ_SM "Number of small IO requests issued by a consumer group to hard disks"
CG_IO_RQ_SM_SEC "Number of small IO requests issued by a consumer group to hard disks per second"
CG_IO_TM_LG "The cumulative latency of reading or writing large blocks by a consumer group from hard disks"
CG_IO_TM_LG_RQ "The rate which is the average latency of reading or writing large blocks per request by a consumer group from hard disks"
CG_IO_TM_SM "The cumulative latency of reading or writing small blocks by a consumer group from hard disks"
CG_IO_TM_SM_RQ "The rate which is the average latency of reading or writing small blocks per request by a consumer group from hard disks"
CG_IO_UTIL_LG "Percentage of disk resources utilized by large requests from this consumer group"
CG_IO_UTIL_SM "Percentage of disk resources utilized by small requests from this consumer group"
CG_IO_WT_LG "IORM wait time for large IO requests issued to hard disks by a consumer group"
CG_IO_WT_LG_RQ "Average IORM wait time per request for large IO requests issued to hard disks by a consumer group"
CG_IO_WT_SM "IORM wait time for small IO requests issued to hard disks by a consumer group"
CG_IO_WT_SM_RQ "Average IORM wait time per request for small IO requests issued to hard disks by a consumer group"
Note the following additional details:
-
The consumer group and database for the metric are specified by the
metricObjectName
attribute of theMETRICCURRENT
andMETRICHISTORY
objects. The name is formed by the database name followed by a period (.
) and the consumer group name. For example, for a database namedPRODUCTIONDB
and a consumer group namedOLTP
, themetricObjectName
would bePRODUCTIONDB.OLTP
. -
For metrics that relate to I/O load (for example,
CG_FD_IO_LOAD
andCG_IO_LOAD
), see the additional information relating toCD_IO_LOAD
. -
Starting with Oracle Exadata System Software release 19.1.0, if you configured ASM-scoped security for the Oracle Automatic Storage Management (Oracle ASM) cluster used by the database, then the database name is prefixed with the Oracle ASM cluster name.
-
For cumulative metrics, the metric value for a specific time period can be determined by subtracting values from different
collectionTime
periods. -
For rate metrics, the time period for the metric value is over the previous minute.
-
In the metric descriptions, small I/O requests are less than or equal to 128 KB, and large I/O requests are larger than 128 KB.
-
All consumer group cumulative metrics are reset to zero whenever a category, IORM, or any database resource plan is modified.
-
To list the current metrics for consumer groups, use the following CellCLI command:
CellCLI> LIST METRICCURRENT WHERE objectType = 'IORM_CONSUMER_GROUP' AND metricValue != 0 ATTRIBUTES name, metricObjectName, metricValue, collectionTime
-
For Oracle ASM and all other databases, metrics are only provided for the BACKGROUND and OTHER consumer groups. The BACKGROUND consumer groups are:
-
_ORACLE_BACKGROUND_GROUP_
: High-priority I/O requests from Oracle Database background processes -
_ORACLE_MEDPRIBG_GROUP_
: Medium-priority I/O requests from Oracle Database background processes -
_ORACLE_LOWPRIBG_GROUP_
: Low-priority I/O requests from Oracle Database background processes
6.3.8.3.4 Monitoring IORM with Category Metrics
Category metrics provide information about the I/O load from each category specified in the current IORM category plan.
Category metrics are identified in the
METRICCURRENT
, METRICDEFINITION
, and
METRICHISTORY
objects as having
objectType=IORM_CATEGORY
.
The category for the metric is specified by the metricObjectName
attribute of the METRICCURRENT
and
METRICHISTORY
objects.
For cumulative metrics, the metric value for a specific time period
can be determined by subtracting values from different
collectionTime
periods. For rate metrics, the time period
for the metric value is over the previous minute.
Example 6-11 Displaying Category Metric Definitions
This example shows how to display the category metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = IORM_CATEGORY
CT_FC_IO_BY_SEC "Number of megabytes of I/O per second for this category to flash cache"
CT_FC_IO_RQ "Number of IO requests issued by an IORM category to flash cache"
CT_FC_IO_RQ_LG "Number of large IO requests issued by an IORM category to flash cache"
CT_FC_IO_RQ_LG_SEC "Number of large IO requests issued by an IORM category to flash cache per second"
CT_FC_IO_RQ_SEC "Number of IO requests issued by an IORM category to flash cache per second"
CT_FC_IO_RQ_SM "Number of small IO requests issued by an IORM category to flash cache"
CT_FC_IO_RQ_SM_SEC "Number of small IO requests issued by an IORM category to flash cache per second"
CT_FD_IO_BY_SEC "Number of megabytes of I/O per second for this category to flash disks"
CT_FD_IO_LOAD "Average I/O load from this category for flash disks"
CT_FD_IO_RQ_LG "Number of large IO requests issued by an IORM category to flash disks"
CT_FD_IO_RQ_LG_SEC "Number of large IO requests issued by an IORM category to flash disks per second"
CT_FD_IO_RQ_SM "Number of small IO requests issued by an IORM category to flash disks"
CT_FD_IO_RQ_SM_SEC "Number of small IO requests issued by an IORM category to flash disks per second"
CT_FD_IO_TM "The cumulative latency of reading or writing blocks for this category from flash disks"
CT_FD_IO_TM_LG "The cumulative latency of reading or writing large blocks for this category from flash disks"
CT_FD_IO_TM_LG_RQ "The rate which is the average latency of reading or writing large blocks per request for this category from flash disks"
CT_FD_IO_TM_RQ "The rate which is the average latency of reading or writing blocks per request for this category from flash disks"
CT_FD_IO_TM_SM "The cumulative latency of reading or writing small blocks for this category from flash disks"
CT_FD_IO_TM_SM_RQ "The rate which is the average latency of reading or writing small blocks per request for this category from flash disks"
CT_FD_IO_UTIL "Percentage of flash resources utilized by this category"
CT_FD_IO_UTIL_LG "Percentage of flash resources utilized by large requests from this category"
CT_FD_IO_UTIL_SM "Percentage of flash resources utilized by small requests from this category"
CT_FD_IO_WT_LG "IORM wait time for large IO requests issued to flash disks by an IORM category"
CT_FD_IO_WT_LG_RQ "Average IORM wait time per request for large IO requests issued to flash disks by an IORM category"
CT_FD_IO_WT_SM "IORM wait time for small IO requests issued to flash disks by an IORM category"
CT_FD_IO_WT_SM_RQ "Average IORM wait time per request for small IO requests issued to flash disks by an IORM category"
CT_IO_BY_SEC "Number of megabytes of I/O per second for this category to hard disks"
CT_IO_LOAD "Average I/O load from this category for hard disks"
CT_IO_RQ_LG "Number of large IO requests issued by an IORM category to hard disks"
CT_IO_RQ_LG_SEC "Number of large IO requests issued by an IORM category to hard disks per second"
CT_IO_RQ_SM "Number of small IO requests issued by an IORM category to hard disks"
CT_IO_RQ_SM_SEC "Number of small IO requests issued by an IORM category to hard disks per second"
CT_IO_TM_LG "The cumulative latency of reading or writing large blocks for this category from hard disks"
CT_IO_TM_LG_RQ "The rate which is the average latency of reading or writing large blocks per request for this category from hard disks"
CT_IO_TM_SM "The cumulative latency of reading or writing small blocks for this category from hard disks"
CT_IO_TM_SM_RQ "The rate which is the average latency of reading or writing small blocks per request for this category from hard disks"
CT_IO_UTIL_LG "Percentage of disk resources utilized by large requests from this category"
CT_IO_UTIL_SM "Percentage of disk resources utilized by small requests from this category"
CT_IO_WT_LG "IORM wait time for large IO requests issued to hard disks by an IORM category"
CT_IO_WT_LG_RQ "Average IORM wait time per request for large IO requests issued to hard disks by an IORM category"
CT_IO_WT_SM "IORM wait time for small IO requests issued to hard disks by an IORM category"
CT_IO_WT_SM_RQ "Average IORM wait time per request for small IO requests issued to hard disks by an IORM category"
Note the following additional details:
-
In the metric descriptions, small I/O requests are less than or equal to 128 KB, and large I/O requests are larger than 128 KB.
- The unit of measurement for the wait metrics is milliseconds. The wait metrics are those with names containing IO_WT (for example, CT_IO_WT_SM_RQ).
For metrics that relate to I/O load (for example,
CT_FD_IO_LOAD
andCT_IO_LOAD
), see the additional information relating toCD_IO_LOAD
. -
All category cumulative metrics are reset to zero whenever a category, IORM, or any database resource plan is modified.
-
To list the category metric history for an interdatabase plan, use the following CellCLI command:
CellCLI> LIST METRICHISTORY WHERE objectType = 'IORM_CATEGORY' AND metricValue != 0 ATTRIBUTES name, metricObjectName, metricValue, collectionTime
-
Category metrics are also provided for the following internally-generated and automatically-managed categories:
-
_ASM_
: Oracle ASM-related I/Os -
_ORACLE_BG_CATEGORY_
: High-priority I/Os issued by Oracle Database background processes -
_ORACLE_MEDPRIBG_CATEGORY_
: Medium-priority I/Os issued by Oracle Database background processes -
_ORACLE_LOWPRIBG_CATEGORY_
: Low-priority I/Os issued by Oracle Database background processes
-
6.3.8.3.5 Monitoring IORM Utilization
You can use metrics to monitor IORM utilization.
When OLTP and DSS workloads share Oracle Exadata Storage Servers, IORM determines whether to optimize for low latency or high throughput. To optimize for low latency, the concurrency of large I/O requests is reduced to ensure that I/O bandwidth is not saturated. To optimize for high throughput, each Oracle Exadata Storage Server must handle many concurrent large I/O requests, allowing the storage to be fully utilized while applying optimization algorithms. However, when a cell has many concurrent large I/O requests, average latency may increase because each I/O is queued behind other I/Os.
The utilization metrics for I/O requests from a database, pluggable database (PDB), or consumer group correspond to the amount of time that the database, PDB, or consumer group utilized the storage server. Large I/O requests utilize the storage server more than small I/O requests. The following are the utilization metrics for determining IORM optimization (see the example after this list):
- CG_IO_UTIL_LG
- CG_IO_UTIL_SM
- PDB_IO_UTIL_LG
- PDB_IO_UTIL_SM
- CT_IO_UTIL_LG
- CT_IO_UTIL_SM
- DB_IO_UTIL_LG
- DB_IO_UTIL_SM
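For example, the following CellCLI command is one way to display the current per-database utilization metrics so that consumption can be compared across databases; the filter on non-zero values is only illustrative:
CellCLI> LIST METRICCURRENT WHERE name LIKE 'DB_IO_UTIL.*' AND metricValue != 0 ATTRIBUTES name, metricObjectName, metricValue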
By comparing the I/O resource consumption with the I/O resource allocations, the
database administrator can determine if IORM should be
tuned for latency or throughput, or if a balanced approach is optimal. The IORM metric, IORM_MODE
, shows the mode for
IORM. The metric value ranges between 1 and 3. The
following are the definitions for the values:
Note:
If the current plan has the IORM attribute objective set to BASIC, then IORM_MODE has no meaning and should be ignored.
- 1 means the cell IORM objective is set to low_latency.
- 2 means the cell IORM objective is set to balanced.
- 3 means the cell IORM objective is set to high_throughput.
A value between 1 and 2, or between 2 and 3, indicates that the IORM objective changed during the metric period, and the precise value indicates proximity to a given objective. It can also indicate a constantly changing mix of workloads.
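For example, the following CellCLI commands are one way to check the current IORM mode on a cell and to review how it has changed over the retained metric history:
CellCLI> LIST METRICCURRENT IORM_MODE
CellCLI> LIST METRICHISTORY IORM_MODE ATTRIBUTES name, metricObjectName, metricValue, collectionTime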
6.3.8.4 What to Look For When Monitoring I/O Resource Management (IORM)
I/O Latency
Issues with IORM typically result in increased I/O latency. This is
usually characterized by higher latency in the cell single block physical
read
database wait event, and in some cases the cell smart
table scan
database wait event. If these database wait events are
significant, and there are no corresponding latencies in the storage servers that are associated with the flash or disk devices, then this may indicate that IORM is throttling the workload.
To confirm that IORM throttling is occurring:
- In the Top Databases by Requests - Details section of the AWR report, review the Queue Time columns, which show the average amount of time that IORM spent throttling IO requests for the database.
- Review the cell metrics for IORM wait times: DB_IO_WT_SM_RQ, DB_IO_WT_LG_RQ, PDB_IO_WT_SM_RQ, PDB_IO_WT_LG_RQ, and CG_IO_WT_SM_RQ.
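For example, the following CellCLI command is a sketch of how to review the current per-request IORM wait time for each database; similar commands apply to the PDB and consumer group metrics:
CellCLI> LIST METRICCURRENT WHERE name LIKE 'DB_IO_WT_.*_RQ' ATTRIBUTES name, metricObjectName, metricValue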
You can use IORM database statistics and the AWR report to understand the I/O workload as a whole. You can use Exadata metrics to further understand I/O consumption by each category, database, or consumer group. By analyzing statistics and metrics, you can understand which category, database, pluggable database (PDB), or consumer group is not using its resource allocation and which is exceeding its resource allocation.
If the wait times are small or zero, then the plan allocation is sufficient. If the wait times are large, then the plan allocation is insufficient. If the wait times due to high I/O latency result in unacceptable performance, then the IORM plan can be adjusted to give a larger allocation, or additional storage servers may be required to deliver the required I/O resources.
Using IOSTAT
Device utilization and I/O service time
statistics monitored using iostat
(which is collected by
ExaWatcher) are unreliable.
The following is stated in the Linux
man
page for iostat
:
svctm - The average service time (in milliseconds) for I/O requests that were
issued to the device. Warning! Do not trust this field any more. This field will be
removed in a future sysstat version.
Since the utilization computation depends upon the I/O service time, it is also unreliable.
To monitor the actual I/O utilization for a cell disk, database, pluggable database or consumer group, use the corresponding IORM metrics.
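For example, the following CellCLI command is one way to display the current overall utilization for each cell disk:
CellCLI> LIST METRICCURRENT CD_IO_UTIL ATTRIBUTES metricObjectName, metricValue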
Parent topic: Monitoring I/O Resource Management (IORM)
6.3.9 Monitoring Cell Disk I/O
When database performance issues are related to I/O load on the Exadata storage servers, typically there will be increased latencies in the I/O-related wait events, and increased database time in the User I/O or System I/O wait classes. If the increased database latencies are due to the performance of the Exadata storage servers, then the increased latencies will also be visible in the cell-side statistics. If you have comprehensive baseline statistics for periods when the system is performing well, then you can compare the baseline statistics with other observations to identify differences and explain the situation.
For example, if Oracle Database reports increased cell single block physical
read
latencies, you can then check the statistics from the storage servers
and compare them to a baseline to determine if the cause is increased latency on flash
devices, more disk I/O, or some other cause. If the statistics show an increase in disk
I/O requests, that may be related to a change in Exadata Smart Flash Cache, which would prompt you to review the Exadata Smart Flash Cache statistics. On the other
hand, if the statistics show a latency increase for small reads from flash, then this
may be caused by a change in the I/O pattern, and understanding the type of I/O that has
increased (small reads, small writes, large reads, or large writes) can help to drive
further investigation.
Typically, in an Exadata environment, an even workload distribution is expected across all cells and all disks. If one cell or one disk is doing more work than the others, then you should investigate further, as that cell or disk has the potential of slowing down the entire system.
- Monitoring Cell Disk I/O Using AWR
- Monitoring Cell Disk I/O Using Database Statistics and Wait Events
- Monitoring Cell Disk I/O Using Exadata Metrics
- What to Look For When Monitoring Cell Disk I/O
Parent topic: Monitoring Oracle Exadata System Software Components
6.3.9.1 Monitoring Cell Disk I/O Using AWR
The following sections in the Automatic Workload Repository (AWR) report are particularly useful for understanding I/O load on Exadata. Often, to better understand the characteristics of the I/O load, the statistics from these sections can be correlated with other sections in the AWR report.
Disk Activity
The Disk Activity section provides a high-level summary for potential sources of disk activity. The Disk Activity section is located in the AWR report under Exadata Statistics > Performance Summary.
Figure 6-26 AWR Report: Disk Activity
High I/O load or a substantial change in the pattern of disk activity may prompt further investigation. Possible causes include:
- Redo log writes — result in disk writes when redo is written to disk. When using Exadata Smart Flash Log, note that redo is written to both Exadata Smart Flash Log and the online redo log file. Also, Oracle Exadata System Software release 20.1 adds a further optimization, known as Smart Flash Log Write-Back, that uses Exadata Smart Flash Cache in Write-Back mode instead of disk storage. For further details, review Database Redo Activity and Smart Flash Log in the AWR report.
- Smart Scans — result in disk reads for requests that are not satisfied using Exadata Smart Flash Cache. These are typically large reads. For further details, review Smart IO in the AWR report.
- Flash Cache misses — result in disk reads when requested data does not exist in Exadata Smart Flash Cache. These are typically small reads. For further details, review Flash Cache Misses in the AWR report.
- Flash Cache read skips — result in disk reads when requested data is not eligible for Exadata Smart Flash Cache. For further details, review Flash Cache User Reads - Skips in the AWR report.
- Flash Cache write skips or Flash Cache LW rejections — result in disk writes when data is not eligible for Exadata Smart Flash Cache. For further details, review Flash Cache User Writes - Skips and Flash Cache User Writes - Large Write Rejects in the AWR report.
- Disk writer writes — result in disk writes when data from Exadata Smart Flash Cache in Write-Back mode is persisted to disk. For further details, review Flash Cache Internal Writes in the AWR report.
- Scrub IO — occurs when Oracle Exadata System Software automatically inspects and repairs the hard disks. Scrub I/O is performed periodically when the hard disks are idle, and mostly results in large disk reads, which should be throttled automatically if the disk becomes I/O bound.
The specific causes listed in this section are subject to the version of Oracle Database being used.
Exadata Resource Statistics
The Exadata Resource Statistics section contains many statistics and is organized
into several sub-sections. Primarily, the statistics enumerate the I/O occurring on
the storage servers using information from the storage server operating system (OS)
and Oracle Exadata System Software. From the OS, it
includes statistics relating to I/Os per second (IOPS), throughput, utilization,
service time and wait time. These statistics are equivalent to the statistics shown
by the iostat
command. From the Oracle Exadata System Software, it includes statistics relating to IOPS,
throughput, and latency, which are also broken down by small reads, small writes,
large reads, and large writes. These statistics are based on the cell disk
metrics.
The statistics are aggregated by device type, and then by cell or disk. The device type aggregation ensures comparison across the same device type, as different device types are expected to have different performance characteristics. The statistics are presented in two ways: first, to enable outlier analysis, and second, to show the 'Top N' cells or disks for a specific statistic.
The outlier analysis presentation allows you to quickly see the statistics aggregated across all storage servers, by cell, and by disk. The display also includes the statistical mean, standard deviation, and normal range. The normal range is based on the mean and standard deviation, not the observed low and high values. For cells, the normal range is the range of values that are plus or minus one standard deviation from the mean. For disks, the normal range is the range of values that are plus or minus three standard deviations from the mean. If there are cells or disks that fall outside the normal range, then they are reported as outliers. This simple outlier analysis is designed to highlight areas for further investigation. However, based on the number of cells or disks in use, and the value of the standard deviation, the outlier analysis may not identify outliers in all cases.
The 'Top N' presentation simply shows the top ranked cells or disks for a specific statistic. This presentation enables you to identify cells or disks that perform more or less work than the others. By using this presentation, you can potentially identify outliers that are not identified by the outlier analysis. Also highlighted in these sections are cells or disks that exceed the expected maximum IOPS for the device or the expected maximum throughput for the device.
The following list outlines the sub-sections in the Exadata Resource Statistics section of the AWR report:
- Exadata Outlier Summary — displays a summary of outliers identified in the various outlier sub-sections.
- Exadata OS Statistics Outliers — contains outlier sub-sections for cells and disks based on OS statistics, including IOPS, throughput (MB/s), utilization percentage, service time, wait time, and CPU utilization per cell.
- Exadata Cell Server Statistics — contains outlier sub-sections for cells and disks based on cell disk metrics, including IOPS, throughput (MB/s), and latency. The statistics are further categorized by I/O type; that is, small read, small write, large read, or large write.
- Exadata Outlier Details — displays detailed information for the identified outliers, along with other statistics related to the outlier.
- Exadata OS Statistics Top — contains 'Top N' sub-sections for cells and disks based on OS statistics, including IOPS, latency, and CPU utilization.
- Exadata Cell Server Statistics Top — contains 'Top N' sub-sections for cells and disks based on cell disk metrics, including IOPS, throughput (MB/s), and latency. The statistics are further categorized by I/O type; that is, small read, small write, large read, or large write.
The following example shows two of the Exadata Cell Server Statistics outlier sub-sections in the AWR report. The example highlights that hard disk IOPS exceeds the expected maximum, and that a specific disk is performing substantially more small reads than the other disks.
Figure 6-27 AWR Report: Exadata Cell Server IOPS Statistics Outliers
Exadata IO Reasons
When the database sends an I/O request to Exadata, the request is tagged with information that includes the reason for the I/O. This information is aggregated in the Exadata IO Reasons sections of the AWR report and allows you to understand the reasons for performing I/O.
The AWR report contains sub-sections that display the Top IO Reasons by Requests and the Top IO Reasons by MB (throughput). Later versions of the AWR report further break down the Top IO Reasons by Requests to categorize read requests from flash, write requests to flash, read requests from disk, and write requests to disk.
The following example shows Top IO Reasons by Requests. The example output is typical of a well-performing system, with a high proportion of I/O that is associated with Smart Scan, and similar I/O profiles across all of the storage servers.
Figure 6-28 AWR Report: Top IO Reasons by Request
Internal IO Reasons
If Internal IO is among the Top IO Reasons reported, then the AWR report will include a section that summarizes the Internal IO Reasons:
Figure 6-29 AWR Report: Internal IO Reasons
Possible causes of internal I/O include:
- Disk Writer reads — result in flash reads when the disk writer reads from Exadata Smart Flash Cache in Write-Back mode to persist data to disk. For further details, review Flash Cache Internal Reads in the AWR report.
- Disk Writer writes — result in disk writes when the disk writer persists data to disk from Exadata Smart Flash Cache in Write-Back mode. For further details, review Flash Cache Internal Writes in the AWR report.
- Population — results in flash writes when requested data is read into Exadata Smart Flash Cache. When the data is read from disk, it also populates Exadata Smart Flash Cache. This is often correlated with flash cache misses. For further details, review Flash Cache User Reads and Flash Cache Internal Writes in the AWR report.
- Metadata — results in flash writes when new data is written to Exadata Smart Flash Cache in Write-Back mode. This is often due to first writes. For further details, review Flash Cache User Writes in the AWR report.
The specific causes listed in this section are subject to the version of Oracle Database being used.
Parent topic: Monitoring Cell Disk I/O
6.3.9.2 Monitoring Cell Disk I/O Using Database Statistics and Wait Events
The following table describes various database statistics that are useful for
monitoring cell disk I/O. The statistics are available in various dynamic performance
views, including V$SYSSTAT
, and may be displayed in the Global Activity
Statistics or Instance Activity Statistics section of an AWR report.
Statistic | Description
---|---
physical read IO requests | Number of physical read requests issued by the database.
physical read requests optimized | Total number of read requests satisfied by using Exadata Smart Flash Cache, and read requests avoided by using storage index or columnar cache.
physical read total bytes | Total amount of I/O bytes for reads issued by the database, whether or not the read was offloaded to the storage servers.
physical read total bytes optimized | Total number of bytes read from Exadata Smart Flash Cache, and bytes avoided by using storage index or columnar cache.
physical read total IO requests | Total number of read requests issued by the database for all activity, including application, backup, recovery, and other utilities.
physical write IO requests | Number of physical write requests issued by the database.
physical write total bytes | Total number of I/O bytes for writes issued by the database for all activity.
physical write total bytes optimized | Total number of bytes written to Exadata Smart Flash Cache. These bytes are synchronized to disk in a lazy manner.
physical write total IO requests | Total number of write requests issued by the database for all activity, including application, backup, recovery, and other utilities.
Additional database I/O statistics are also contained in the following V$IOSTAT views, which are also used in the IOStat sections of the AWR report.
- V$IOSTAT_FILE — displays information about disk I/O statistics for database files, including data files, temp files, and other types of database files.
- V$IOSTAT_FUNCTION — displays disk I/O statistics for database functions, such as the LGWR and DBWR.
- V$IOSTAT_FUNCTION_DETAIL — displays disk I/O statistics for database functions, broken down by file type.
The following table describes database wait events that are useful for
monitoring cell disk I/O. The wait events are visible in various dynamic performance
views, including V$SESSION
, V$SYSTEM_EVENT
and
V$SESSION_EVENT
, and may be displayed in the Wait Event sections of
the AWR report.
Wait Event | Description
---|---
cell list of blocks physical read | This wait event occurs during recovery or during buffer pre-fetching (rather than performing multiple single-block reads). It is used to monitor database blocks that must be changed as part of recovery and are read in parallel for the database. This wait event is equivalent to …
cell list of blocks read request | This is a placeholder wait event associated with …
cell multiblock physical read | This wait event represents the time taken to perform all the I/Os for a multi-block database read. This wait event is equivalent to …
cell multiblock read request | This is a placeholder wait event associated with …
cell single block physical read | This wait event represents the time taken to perform a single block database I/O. This wait event is equivalent to … Commencing with the May 2022 Oracle Database release updates (versions 19.15.0.0.220419, 21.6.0.0.220419, and later), this wait event no longer includes I/O from Exadata Smart Flash Cache or database I/O using a Remote Direct Memory Access (RDMA) read.
cell single block physical read: flash cache | This wait event represents the time taken to perform a single block database I/O from Exadata Smart Flash Cache. This wait event was introduced in the May 2022 Oracle Database release updates and is present in Oracle Database versions 19.15.0.0.220419, 21.6.0.0.220419, and later. Previously, …
… | This wait event represents the time taken to perform a single block database I/O from XRMEM cache. The wait event is named …
cell single block physical read: RDMA | This wait event represents the time taken to perform a single block database I/O using a Remote Direct Memory Access (RDMA) read. This wait event was introduced in the May 2022 Oracle Database release updates and is present in Oracle Database versions 19.15.0.0.220419, 21.6.0.0.220419, and later. Previously, …
cell single block read request | This is a placeholder wait event associated with a single block database I/O that is visible only during the wait period. After the wait event ends, the placeholder is converted to the appropriate wait event, which is typically one of the …
cell interconnect retransmit during physical read | This wait event appears during retransmission for an I/O of a single-block or multiblock read. The cell hash number in the …
The availability of a specific statistic or wait event is subject to the version of Oracle Database being used.
Parent topic: Monitoring Cell Disk I/O
6.3.9.3 Monitoring Cell Disk I/O Using Exadata Metrics
Cell disk metrics provide information about the I/O load for cell disks, such as the number of large blocks read from a cell disk.
Cell disk metrics are identified in the METRICCURRENT
,
METRICDEFINITION
, and METRICHISTORY
objects as
having objectType=CELLDISK
.
The cell disk for the metric is specified by the
metricObjectName
attribute of the
METRICCURRENT
and METRICHISTORY
objects.
For cumulative metrics, the metric value for a specific time period is
determined by subtracting values from different collectionTime
periods. For rate metrics, the time period for the metric value is over the previous
minute.
Example 6-12 Displaying Cell Disk Metric Definitions
This example shows how to display the cell disk metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = CELLDISK
CD_BY_FC_DIRTY "Number of unflushed megabytes cached in FLASHCACHE on a cell disk"
CD_IO_BY_R_LG "Number of megabytes read in large blocks from a cell disk"
CD_IO_BY_R_LG_SEC "Number of megabytes read in large blocks per second from a cell disk"
CD_IO_BY_R_SCRUB "Number of megabytes read from a cell disk by the scrubbing job"
CD_IO_BY_R_SCRUB_SEC "Number of megabytes read per second from a cell disk by the scrubbing job"
CD_IO_BY_R_SM "Number of megabytes read in small blocks from a cell disk"
CD_IO_BY_R_SM_SEC "Number of megabytes read in small blocks per second from a cell disk"
CD_IO_BY_W_LG "Number of megabytes written in large blocks to a cell disk"
CD_IO_BY_W_LG_SEC "Number of megabytes written in large blocks per second to a cell disk"
CD_IO_BY_W_SM "Number of megabytes written in small blocks to a cell disk"
CD_IO_BY_W_SM_SEC "Number of megabytes written in small blocks per second to a cell disk"
CD_IO_ERRS "Number of IO errors on a cell disk"
CD_IO_ERRS_MIN "Number of IO errors on a cell disk per minute"
CD_IO_ERRS_SCRUB "Number of IO errors hit by the scrubbing job on a cell disk"
CD_IO_LOAD "Average I/O load for the cell disk"
CD_IO_RQ_R_LG "Number of requests to read large blocks from a cell disk"
CD_IO_RQ_R_LG_SEC "Number of requests to read large blocks per second from a cell disk"
CD_IO_RQ_R_SCRUB "Number of requests to read from a cell disk by the scrubbing job"
CD_IO_RQ_R_SCRUB_SEC "Number of requests to read per second from a cell disk by the scrubbing job"
CD_IO_RQ_R_SM "Number of requests to read small blocks from a cell disk"
CD_IO_RQ_R_SM_SEC "Number of requests to read small blocks per second from a cell disk"
CD_IO_RQ_W_LG "Number of requests to write large blocks to a cell disk"
CD_IO_RQ_W_LG_SEC "Number of requests to write large blocks per second to a cell disk"
CD_IO_RQ_W_SM "Number of requests to write small blocks to a cell disk"
CD_IO_RQ_W_SM_SEC "Number of requests to write small blocks per second to a cell disk"
CD_IO_ST_RQ "Average service time per request for small IO requests to a cell disk"
CD_IO_TM_R_LG "Cumulative latency of reading large blocks from a cell disk"
CD_IO_TM_R_LG_RQ "Average latency of reading large blocks per request to a cell disk"
CD_IO_TM_R_SM "Cumulative latency of reading small blocks from a cell disk"
CD_IO_TM_R_SM_RQ "Average latency of reading small blocks per request from a cell disk"
CD_IO_TM_W_LG "Cumulative latency of writing large blocks to a cell disk"
CD_IO_TM_W_LG_RQ "Average latency of writing large blocks per request to a cell disk"
CD_IO_TM_W_SM "Cumulative latency of writing small blocks to a cell disk"
CD_IO_TM_W_SM_RQ "Average latency of writing small blocks per request to a cell disk"
CD_IO_UTIL "Percentage of disk resources utilized for the cell disk"
CD_IO_UTIL_LG "Percentage of disk resources utilized by large requests for the cell disk"
CD_IO_UTIL_SM "Percentage of disk resources utilized by small requests for the cell disk"
Note the following additional details:
- CD_IO_LOAD represents the average I/O load for the cell disk. I/O load specifies the length of the disk queue. It is similar to iostat's avgqu-sz, but I/O load is a weighted value depending on the type of disk:
  - For hard disks, a large I/O has three times the weight of a small I/O.
  - For flash disks, large and small I/Os have the same weight.
  Corresponding metrics are also available for each database (DB_IO_LOAD), pluggable database (PDB) (PDB_IO_LOAD), IORM category (CT_IO_LOAD), and consumer group (CG_IO_LOAD).
- CD_IO_ST_RQ cannot be used for flash devices, as explained in Using IOSTAT.
- CD_IO_UTIL is similar to %util of iostat for hard disks, but not for flash devices, as mentioned in Using IOSTAT. For flash devices, it is a percentage of the maximum I/O bandwidth for your system, as specified in the product data sheet. Because this metric is computed by IORM, it is also available per database, PDB, and consumer group.
- The unit of measurement for the latency metrics is microseconds. The latency metrics have metric names starting with CD_IO_TM_.
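For example, the following CellCLI command is a sketch of how to review the recent history of average small-read latency for each cell disk; the timestamp is only a placeholder and should be adjusted to the period of interest:
CellCLI> LIST METRICHISTORY CD_IO_TM_R_SM_RQ WHERE collectionTime > '2023-05-01T08:00:00-07:00' ATTRIBUTES metricObjectName, metricValue, collectionTime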
Parent topic: Monitoring Cell Disk I/O
6.3.9.4 What to Look For When Monitoring Cell Disk I/O
Imbalances
In an Exadata environment, an even load distribution is expected across all cells or disks. However, there are situations that may cause an imbalance.
A load imbalance may be due to characteristics of the workload, such as:
- Repeated small-table scans — This is often caused by a table on the right side of a nested loop join. Since the small table may only reside on a few disks or cells, repeated access means reading from a small set of devices, which may be flash devices if the data resides in Exadata Smart Flash Cache. To address the imbalance, you can identify affected SQL statements and review their execution plans. You can also consider the physical organization of the affected segments.
- Repeated control file reads — Control file reads may be caused by queries against some database dynamic performance views. The control file is small and may only reside on a few disks or cells, so repeated access means reading from a small set of devices, usually flash. Repeated control file reads are evident in the IOStat by File Type section of the AWR report and using the control file sequential read wait event.
  To understand the cause of repeated control file reads, you can use the Exadata IO Reasons section of the AWR report to identify cells servicing many control file reads and correlate this with statistics from the Top Databases section to identify databases issuing the control file reads. You can also review Active Session History (ASH) to identify SQL statements that are experiencing waits for control file sequential read.
  Be aware that control file reads may account for a disproportionate amount of small I/O on a lightly-loaded system. In this case, you can safely ignore the imbalance.
- Repeated reads of a small segment, such as a LOB segment — To address this kind of imbalance, review the database statistics for the segment, or examine ASH to identify the responsible SQL commands. Then, examine the SQL and the surrounding application logic to see if the repeated reads can be avoided.
- Repeated ASM metadata access — This is often caused by database queries related to space usage, which require access to ASM metadata. The ASM metadata is small and may only reside on a few disks or cells, so repeated access may show up as an imbalance. This can show up in the Exadata IO Reasons sections of the AWR report with reasons that are prefixed with ASM, such as ASM CACHE IO. You can use the Exadata IO Reasons section to identify affected cells and correlate this with statistics from the Top Databases sections to confirm ASM as the source of the I/O. To address the imbalance, review the need for and frequency of the ASM space usage queries that access the metadata.
Apart from workload-related causes, imbalances may also be caused by:
- Uneven data distribution — Hot spots may occur when some disks contain more or less data than others. To check for balanced data distribution, you can query the V$ASM_DISK_IOSTAT view before and after running a SQL query that contains large full-table scans. In this case, the statistics in the READS column and the BYTES_READ column should be approximately the same for all disks in the disk group. You can also check for balanced data distribution by using the script available in My Oracle Support document 367445.1.
- Asymmetric grid disk configurations — If the grid disks are sized differently on different cells, then different amounts of data may reside on each cell, resulting in imbalance due to uneven data distribution. For this reason, symmetric configuration is recommended across all storage servers in a system.
- Failures or recovery from failures — Some processing cannot spread evenly across the remaining disks or cells. For example, if a flash card fails, the cell containing the card has less available flash cache space, and this may show up as an imbalance when the flash cache statistics are compared with the other cells.
An imbalance often shows up in different statistics across the AWR report. By correlating different sections in the AWR report you may gain a deeper understanding of the imbalance.
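When checking for an imbalance, it can also be useful to compare the same cell disk metric across all storage servers. The following dcli invocation is only a sketch; it assumes a cell_group file that lists the storage servers and an existing user equivalence for the celladmin user:
dcli -g cell_group -l celladmin cellcli -e "LIST METRICCURRENT CD_IO_RQ_R_SM_SEC ATTRIBUTES metricObjectName, metricValue"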
High I/O Load
When database performance issues are related to I/O load on the Exadata storage servers, typically there will be increased latencies in the I/O-related wait events, and increased database time in the User I/O or System I/O wait classes. When dealing with high I/O load, first understand the composition of the I/O load. Based on the composition of the I/O load, you may be directed to different statistics to gain a deeper understanding.
In the AWR report, a good starting point is the Disk Activity section, which provides a high-level summary for potential sources of disk activity. The Disk Activity section is located in the AWR report under Exadata Statistics > Performance Summary. See Disk Activity.
Internal I/O
In the AWR report, if internal I/O is reported among the Top IO Reasons, then the AWR report also includes a section that summarizes the Internal IO Reasons. Use this section to understand the composition of internal I/O. Based on the composition of the internal I/O load, you may be directed to different statistics to gain a deeper understanding. See Internal IO Reasons.
Parent topic: Monitoring Cell Disk I/O
6.3.10 Monitoring Grid Disks
Grid disk metrics provide information about the I/O load for grid disks, such as the number of large blocks read from a grid disk.
Grid disk metrics are identified in
the METRICCURRENT
,
METRICDEFINITION
, and
METRICHISTORY
objects as having
objectType=GRIDDISK
.
The grid disk for the metric is
specified by the metricObjectName
attribute of the
METRICCURRENT
and METRICHISTORY
objects.
For cumulative metrics, the metric value for a specific time period is
determined by subtracting values from different collectionTime
periods. For
rate metrics, the time period for the metric value is over the previous minute.
Example 6-13 Displaying Grid Disk Metric Definitions
This example shows how to display the grid disk metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = GRIDDISK
GD_BY_FC_DIRTY "Number of unflushed megabytes cached in FLASHCACHE for a grid disk"
GD_IO_BY_R_LG "Number of megabytes read in large blocks from a grid disk"
GD_IO_BY_R_LG_SEC "Number of megabytes read in large blocks per second from a grid disk"
GD_IO_BY_R_SCRUB "Number of megabytes read from a grid disk by the scrubbing job"
GD_IO_BY_R_SCRUB_SEC "Number of megabytes read per second from a grid disk by the scrubbing job"
GD_IO_BY_R_SM "Number of megabytes read in small blocks from a grid disk"
GD_IO_BY_R_SM_SEC "Number of megabytes read in small blocks per second from a grid disk"
GD_IO_BY_W_LG "Number of megabytes written in large blocks to a grid disk"
GD_IO_BY_W_LG_SEC "Number of megabytes written in large blocks per second to a grid disk"
GD_IO_BY_W_SM "Number of megabytes written in small blocks to a grid disk"
GD_IO_BY_W_SM_SEC "Number of megabytes written in small blocks per second to a grid disk"
GD_IO_ERRS "Number of IO errors on a grid disk"
GD_IO_ERRS_MIN "Number of IO errors on a grid disk per minute"
GD_IO_ERRS_SCRUB "Number of IO errors hit by the scrubbing job on a grid disk"
GD_IO_RQ_R_LG "Number of requests to read large blocks from a grid disk"
GD_IO_RQ_R_LG_SEC "Number of requests to read large blocks per second from a grid disk"
GD_IO_RQ_R_SCRUB "Number of requests to read from a grid disk by the scrubbing job"
GD_IO_RQ_R_SCRUB_SEC "Number of requests to read per second from a grid disk by the scrubbing job"
GD_IO_RQ_R_SM "Number of requests to read small blocks from a grid disk"
GD_IO_RQ_R_SM_SEC "Number of requests to read small blocks per second from a grid disk"
GD_IO_RQ_W_LG "Number of requests to write large blocks to a grid disk"
GD_IO_RQ_W_LG_SEC "Number of requests to write large blocks per second to a grid disk"
GD_IO_RQ_W_SM "Number of requests to write small blocks to a grid disk"
GD_IO_RQ_W_SM_SEC "Number of requests to write small blocks per second to a grid disk"
GD_NVM_READ_RETRIES "Number of read retries for a non-volatile memory (NVM) grid disk"
GD_SP_BY_ALLOCATED "Allocated physical space for grid disk in bytes"
GD_SP_IO_BY_PARTIAL "Bytes returned by partial IOs"
GD_SP_IO_BY_PARTIAL_SEC "Bytes returned by partial IOs per second"
GD_SP_IO_BY_REDIRECTED "Sparse bytes redirected to original data block"
GD_SP_IO_BY_REDIRECTED_SEC "Sparse bytes redirected to original data block per second"
GD_SP_IO_RQ_PARTIAL "Number of IO requests that returned partial data"
GD_SP_IO_RQ_PARTIAL_SEC "Number of IO requests that returned partial data per second"
GD_SP_PRCT_ALLOCATED "Allocated physical space for grid disk by percentage"
Note the following additional details:
- GD_SP_PRCT_ALLOCATED tracks the allocated physical space for the grid disk, and is presented as a percentage. By default, a warning alert is generated when space usage is more than 95%, and a critical alert is generated when space usage reaches 99%. To lower the space usage, you can increase the size of the grid disk, or drop some of the data on the grid disk.
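For example, the following CellCLI command is one way to check the current space allocation percentage for each grid disk on a cell:
CellCLI> LIST METRICCURRENT GD_SP_PRCT_ALLOCATED ATTRIBUTES metricObjectName, metricValue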
Parent topic: Monitoring Oracle Exadata System Software Components
6.3.11 Monitoring Host Interconnect Metrics
Host interconnect metrics provide information about the I/O transmission for hosts that access cell storage.
Host interconnect metrics are identified in the
METRICCURRENT
, METRICDEFINITION
, and
METRICHISTORY
objects as having
objectType=HOST_INTERCONNECT
.
The host interconnect for the metric is specified by the
metricObjectName
attribute of the
METRICCURRENT
and METRICHISTORY
objects. For
cumulative metrics, the metric value for a specific time period can be determined by
subtracting values from different collectionTime
periods. For rate
metrics, the time period for the metric value is over the previous minute.
Example 6-14 Displaying Host Interconnect Metric Definitions
This example shows how to display the host interconnect metric definitions that are available in the Oracle Exadata System Software.
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = HOST_INTERCONNECT
N_MB_DROP "Number of megabytes droped during transmission to a particular host"
N_MB_DROP_SEC "Number of megabytes droped during transmission per second to a particular host"
N_MB_RDMA_DROP "Number of megabytes dropped during RDMA transmission to a particular host"
N_MB_RDMA_DROP_SEC "Number of megabytes dropped during RDMA transmission per second to a particular host"
N_MB_RECEIVED "Number of megabytes received from a particular host"
N_MB_RECEIVED_SEC "Number of megabytes per second received from a particular host"
N_MB_RESENT "Number of megabytes resent to a particular host"
N_MB_RESENT_SEC "Number of megabytes resent per second to a particular host"
N_MB_SENT "Number of megabytes transmitted to a particular host"
N_MB_SENT_SEC "Number of megabytes transmitted per second to a particular host"
N_RDMA_RETRY_TM "Latency of the retry actions during RDMA transmission to a particular host"
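For example, the following CellCLI command is a sketch of how to check for hosts that are currently reporting dropped or retransmitted traffic; the filter on non-zero values is only illustrative:
CellCLI> LIST METRICCURRENT WHERE objectType = 'HOST_INTERCONNECT' AND metricValue != 0 ATTRIBUTES name, metricObjectName, metricValue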
Parent topic: Monitoring Oracle Exadata System Software Components
6.4 Using Exadata Monitoring Objects
- Displaying Metrics
  You can use the CellCLI LIST command to display and monitor metrics for cell objects.
- Monitoring Alerts
  You can monitor and receive notifications for alerts.
- Displaying Active Requests
  Use the LIST ACTIVEREQUEST command to display the active requests for the storage server.
- Using Real-Time Insight
  You can use the real-time insight feature to enable real-time monitoring of your Exadata systems using an external metric collection platform.
Parent topic: Monitoring Exadata
6.4.1 Displaying Metrics
You can use the CellCLI LIST
command to display and monitor metrics for cell objects.
You can display metric definitions, current metrics, and metric history using the LIST
command.
- Displaying Metric Definitions
  Use the LIST METRICDEFINITION command to display the metric definitions for a storage server.
- Displaying Current Metrics
  Use the LIST METRICCURRENT command to display the current metric values for a storage server.
- Displaying Metric History
  Use the LIST METRICHISTORY command to display the metric history for a storage server.
Parent topic: Using Exadata Monitoring Objects
6.4.1.1 Displaying Metric Definitions
Use the LIST METRICDEFINITION
command to display the metric definitions for a storage server.
A metric definition listing shows the configuration of a metric.
Example 6-15 Displaying Metric Definitions
This example shows how to display attributes for the METRICDEFINITION
object.
CellCLI> LIST METRICDEFINITION CL_CPUT DETAIL
CellCLI> LIST METRICDEFINITION WHERE objectType = 'GRIDDISK'
CellCLI> LIST METRICDEFINITION WHERE name LIKE 'CD_IO_RQ.*' -
ATTRIBUTES name, metricType, description
Related Topics
Parent topic: Displaying Metrics
6.4.1.2 Displaying Current Metrics
Use the LIST METRICCURRENT
command to display the current metric values for a storage server.
A current metric listing shows a set of observations on the current value of an individual metric.
Example 6-16 Displaying Current Metric Values
This example shows how to display attributes for the METRICCURRENT
object.
CellCLI> LIST METRICCURRENT CL_TEMP DETAIL
name: CL_TEMP
alertState: normal
collectionTime: 2009-12-17T15:32:25-08:00
metricObjectName: abcd2x3
metricType: Instantaneous
metricValue: 48.0 C
objectType: CELL
CellCLI> LIST METRICCURRENT WHERE objectType = 'CELLDISK' AND -
metricValue != 0 ATTRIBUTES name, metricObjectName, -
metricValue, collectionTime
CD_IO_BY_R_LG CD_00_abcd2x3 1.9 MB 2009-12-17T15:46:52-08:00
CD_IO_BY_R_LG CD_01_abcd2x3 1.0 MB 2009-12-17T15:46:52-08:00
CD_IO_BY_R_LG CD_02_abcd2x3 4.1 MB 2009-12-17T15:46:52-08:00
CD_IO_BY_R_LG CD_03_abcd2x3 9.5 MB 2009-12-17T15:46:52-08:00
CD_IO_BY_R_LG CD_04_abcd2x3 0.1 MB 2009-12-17T15:46:52-08:00
CD_IO_BY_R_LG CD_05_abcd2x3 0.4 MB 2009-12-17T15:46:52-08:00
...
Related Topics
Parent topic: Displaying Metrics
6.4.1.3 Displaying Metric History
Use the LIST METRICHISTORY
command to display the metric history for a storage server.
A metric history listing shows a collection of past individual observations of all metric values.
The retention period for most metric history files is specified by the
metricHistoryDays
cell attribute. The
default retention period is 7 days. You can modify this setting
with the CellCLI ALTER CELL
command.
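For example, the following command is a sketch of how to extend the retention period to 14 days:
CellCLI> ALTER CELL metricHistoryDays=14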
In addition to the metrics governed by the
metricHistoryDays
cell attribute, a
subset of key metric observations are retained for up to one
year. In all cases, historical metric observations are purged
automatically if the server detects a storage space
shortage.
Example 6-17 Displaying Metric History Values
This example shows various LIST METRICHISTORY
commands.
The following command displays observations in the metric history for
the CD_IO_RQ_R_LG
metric when the metric was in
a critical
alert state. The output contains all
of the metric attributes.
CellCLI> LIST METRICHISTORY CD_IO_RQ_R_LG WHERE alertState='critical' DETAIL
The following command displays a subset of metric attributes for all metric observations that are associated with cell disks. The output is limited to observations with non-zero metric values that were observed after the specified time.
CellCLI> LIST METRICHISTORY WHERE objectType = 'CELLDISK' AND metricValue != 0 -
AND collectionTime > '2009-08-12T09:10:51-07:00' ATTRIBUTES name, -
metricObjectName, metricValue, collectionTime
The following command displays all metric observations within a specified time period that are associated with the specified object types.
CellCLI> LIST METRICHISTORY WHERE objectType LIKE 'CELLDISK|FLASHCACHE|FLASHLOG|SMARTIO|IORM_DATABASE' -
AND collectionTime > '2020-07-15T08:00:00-07:00' -
AND collectionTime < '2020-07-15T09:00:00-07:00'
Related Topics
Parent topic: Displaying Metrics
6.4.2 Monitoring Alerts
You can monitor and receive notifications for alerts.
Alerts represent events of importance occurring within the cell. Typically, alerts indicate that Exadata server functionality is compromised or in danger of failure.
To receive notifications, use the ALTER CELL
command.
- Displaying Alert Definitions
  Use the LIST ALERTDEFINITION command to display the alert definitions for the storage server.
- Receiving Alert Notifications
  Administrators for Oracle Exadata System Software can receive alert notifications by email or by Simple Network Management Protocol (SNMP) trap alerts.
- Displaying Alert History
  Use the LIST ALERTHISTORY command to display the alert history that has occurred on a cell.
- Modifying Alert History
  Use the ALTER ALERTHISTORY command to update the alert history for the cell.
Related Topics
Parent topic: Using Exadata Monitoring Objects
6.4.2.1 Displaying Alert Definitions
Use the LIST ALERTDEFINITION
command to display the alert definitions for the storage server.
An alert definition provides a definition for every alert that can be produced on a storage server.
Example 6-18 Listing Alert Definition Attributes
This example shows how to display a detailed list of attributes for the alert ADRAlert
.
CellCLI> LIST ALERTDEFINITION ADRAlert DETAIL
name: ADRAlert
alertShortName: ADR
alertSource: "Automatic Diagnostic Repository"
description: "Incident Alert"
Example 6-19 Listing Alert Definition Name and Description Attributes
You can display a list of specified attributes for an alert definition. This example shows how to display the alert name, metric name, and description. The metric name identifies the metric on which the alert is based. ADRAlert
, HardwareAlert
, Stateful_HardwareAlert
, and Stateful_SoftwareAlert
are not based on a metric, and therefore do not have metric names.
CellCLI> LIST ALERTDEFINITION ATTRIBUTES name, metricName, description
ADRAlert "Incident Alert"
HardwareAlert "Hardware Alert"
MetricAlert "Threshold Alert"
SoftwareAlert "Software Alert"
StatefulAlert_CD_IO_ERRS_MIN CD_IO_ERRS_MIN "Threshold Alert"
StatefulAlert_CG_IO_RQ_LG CG_IO_RQ_LG "Threshold Alert"
StatefulAlert_CG_IO_RQ_LG_SEC CG_IO_RQ_LG_SEC "Threshold Alert"
StatefulAlert_CG_IO_RQ_SM CG_IO_RQ_SM "Threshold Alert"
StatefulAlert_CG_IO_RQ_SM_SEC CG_IO_RQ_SM_SEC "Threshold Alert"
StatefulAlert_CG_IO_WT_LG CG_IO_WT_LG "Threshold Alert"
StatefulAlert_CG_IO_WT_LG_RQ CG_IO_WT_LG_RQ "Threshold Alert"
StatefulAlert_CG_IO_WT_SM CG_IO_WT_SM "Threshold Alert"
StatefulAlert_CG_IO_WT_SM_RQ CG_IO_WT_SM_RQ "Threshold Alert"
StatefulAlert_CL_FSUT CL_FSUT "Threshold Alert"
StatefulAlert_CL_MEMUT CL_MEMUT "Threshold Alert"
StatefulAlert_CT_IO_RQ_LG CT_IO_RQ_LG "Threshold Alert"
StatefulAlert_CT_IO_RQ_LG_SEC CT_IO_RQ_LG_SEC "Threshold Alert"
StatefulAlert_CT_IO_RQ_SM CT_IO_RQ_SM "Threshold Alert"
StatefulAlert_CT_IO_RQ_SM_SEC CT_IO_RQ_SM_SEC "Threshold Alert"
StatefulAlert_CT_IO_WT_LG CT_IO_WT_LG "Threshold Alert"
StatefulAlert_CT_IO_WT_LG_RQ CT_IO_WT_LG_RQ "Threshold Alert"
StatefulAlert_CT_IO_WT_SM CT_IO_WT_SM "Threshold Alert"
StatefulAlert_CT_IO_WT_SM_RQ CT_IO_WT_SM_RQ "Threshold Alert"
StatefulAlert_DB_IO_RQ_LG DB_IO_RQ_LG "Threshold Alert"
StatefulAlert_DB_IO_RQ_LG_SEC DB_IO_RQ_LG_SEC "Threshold Alert"
StatefulAlert_DB_IO_RQ_SM DB_IO_RQ_SM "Threshold Alert"
StatefulAlert_DB_IO_RQ_SM_SEC DB_IO_RQ_SM_SEC "Threshold Alert"
StatefulAlert_DB_IO_WT_LG DB_IO_WT_LG "Threshold Alert"
StatefulAlert_DB_IO_WT_LG_RQ DB_IO_WT_LG_RQ "Threshold Alert"
StatefulAlert_DB_IO_WT_SM DB_IO_WT_SM "Threshold Alert"
StatefulAlert_DB_IO_WT_SM_RQ DB_IO_WT_SM_RQ "Threshold Alert"
StatefulAlert_GD_IO_ERRS_MIN GD_IO_ERRS_MIN "Threshold Alert"
Stateful_HardwareAlert "Hardware Stateful Alert"
Stateful_SoftwareAlert "Software Stateful Alert"
Parent topic: Monitoring Alerts
6.4.2.2 Receiving Alert Notifications
Administrators for Oracle Exadata System Software can receive alert notifications by email or by Simple Network Management Protocol (SNMP) trap alerts.
By using the ALTER CELL
command, you can set the alert notification
attributes to configure Oracle Exadata Storage Server to
send notification email messages or SNMP trap alerts.
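For example, the following command is a sketch of how to configure email notifications for critical, warning, and clear alerts; the mail server and addresses are placeholders, and SNMP subscribers can be configured similarly with the snmpSubscriber attribute:
CellCLI> ALTER CELL smtpServer='mail.example.com', -
                    smtpFromAddr='exadata.alerts@example.com', -
                    smtpToAddr='dba-team@example.com', -
                    notificationMethod='mail', -
                    notificationPolicy='critical,warning,clear'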
Note:
The SNMP alerts conform to a MIB (management information base)
which is included with each installation of Oracle Exadata System Software. The MIB file on Oracle Exadata Storage Server is available at
/opt/oracle/cell/cellsrv/deploy/config/cell_alert.mib
.
Related Topics
Parent topic: Monitoring Alerts
6.4.2.3 Displaying Alert History
Use the LIST ALERTHISTORY
command to display the alert history that has occurred on a cell.
Alert history entries are retained for a maximum of 100 days. If the number of alert history entries exceeds 500, then the alert history entries are only retained for 7 days. When stateful alerts are cleared, meaning that the underlying metric, hardware, or software condition has returned to normal, the stateful alert is retained for either 100 or 7 days, depending on the number of alert history entries. Stateful alerts that are not cleared are retained, regardless of their age.
Example 6-20 Listing Alert History Attributes
This example shows how to display a detailed list of attributes for alert history entries where the severity
attribute is set to critical
and the examinedBy
attribute has not been set.
CellCLI> LIST ALERTHISTORY WHERE severity = 'critical' AND examinedBy = '' DETAIL
Related Topics
Parent topic: Monitoring Alerts
6.4.2.4 Modifying Alert History
Use the ALTER ALERTHISTORY
command to update the alert history for the cell.
Example 6-21 Altering Alert History Attributes
This example shows how to set the examinedBy
attribute to the user ID of the administrator that examined the alert.
CellCLI> ALTER ALERTHISTORY 1671443714 examinedBy="jdoe"
The examinedBy
attribute is the only ALERTHISTORY
attribute that can be modified.
Related Topics
Parent topic: Monitoring Alerts
6.4.3 Displaying Active Requests
Use the LIST ACTIVEREQUEST
command to display the active requests for the storage server.
To view the ACTIVEREQUEST
attributes, use the DESCRIBE ACTIVEREQUEST
command.
Example 6-22 Listing Active Request Attributes
This example shows how to display a detailed list of attributes for a specified request I/O type.
CellCLI> LIST ACTIVEREQUEST WHERE IoType = 'predicate pushing' DETAIL
Related Topics
Parent topic: Using Exadata Monitoring Objects
6.4.4 Using Real-Time Insight
You can use the real-time insight feature to enable real-time monitoring of your Exadata systems using an external metric collection platform.
- Using Fine-Grained Metrics
- Uploading Metric Observations to a Collection Platform
- Downloading Metric Observations from an Exadata Server
- Tagging Metric Observations
- Understanding the Metric Stream Format
Parent topic: Using Exadata Monitoring Objects
6.4.4.1 Using Fine-Grained Metrics
Traditionally, Exadata metric collections occur at 1-minute intervals. However, real-time monitoring requires more timely metrics. Commencing with Oracle Exadata System Software 22.1.0, you can configure fine-grained metric collection.
Fine-grained metric collection is the foundation for real-time insight. Fine-grained metric collection works in conjunction with an external metric collection platform, where the fine-grained metric observations are collected and processed for visualization. Fine-grained metric observations reside only in server memory and are not persisted on the server. Consequently, no fine-grained metric history is maintained on each server.
Parent topic: Using Real-Time Insight
6.4.4.1.1 Controlling Fine-Grained Metric Collection
The metricFGCollIntvlInSec
attribute controls
fine-grained metric collection.
- To enable fine-grained metric collection, you must set the collection interval to a value between 1 and 60 seconds.
  For example:
  CellCLI> ALTER CELL metricFGCollIntvlInSec=1
  The metricFGCollIntvlInSec setting is related to the automatic upload frequency specified in the metricStreamIntvlInSec attribute. When automatic metric upload and fine-grained collection are both enabled (metricStreamIntvlInSec>0 and metricFGCollIntvlInSec>0), metricStreamIntvlInSec must be between 5 and 30 times metricFGCollIntvlInSec. For example, if metricStreamIntvlInSec=60, then metricFGCollIntvlInSec must be between 2 and 12.
- To disable fine-grained metric collection on a server, set metricFGCollIntvlInSec=0.
  For example:
  CellCLI> ALTER CELL metricFGCollIntvlInSec=0
  Fine-grained metric collection can be disabled only when automatic metric upload is disabled (metricStreamIntvlInSec=0) or the automatic upload frequency is between 5 and 30 minutes (metricStreamIntvlInSec is between 300 and 1800).
Parent topic: Using Fine-Grained Metrics
6.4.4.1.2 Customizing Fine-Grained Metric Collection
By default, a set of key performance metrics is automatically enabled for fine-grained collection. But, you can customize fine-grained metric collection by enabling or disabling specific metrics.
- To enable a metric for fine-grained collection, use the ALTER METRICDEFINITION command and specify finegrained=enabled. For example:
  CellCLI> ALTER METRICDEFINITION N_NIC_KB_TRANS_SEC finegrained=enabled
  CellCLI> ALTER METRICDEFINITION N_MB_SENT,N_MB_RECEIVED finegrained=enabled
  CellCLI> ALTER METRICDEFINITION finegrained=enabled WHERE name LIKE 'N_NIC.*'
- To disable a metric for fine-grained collection, use the ALTER METRICDEFINITION command and specify finegrained=disabled. For example:
  CellCLI> ALTER METRICDEFINITION N_MB_SENT finegrained=disabled
- At any time, to view the metrics enabled for fine-grained collection, use the following command:
  CellCLI> LIST METRICDEFINITION WHERE finegrained=enabled
- At any time, to view all of the metric definition details, including each metric description and whether the metric is enabled for fine-grained collection, use the following command:
  CellCLI> LIST METRICDEFINITION DETAIL
Parent topic: Using Fine-Grained Metrics
6.4.4.2 Uploading Metric Observations to a Collection Platform
You can enable an Exadata server to automatically upload (push) metric observations to an external metric collection platform.
- Controlling the Automatic Metric Upload Frequency
- Customizing the Metric Stream
- Configuring the Endpoints for Automatic Metric Upload
Parent topic: Using Real-Time Insight
6.4.4.2.1 Controlling the Automatic Metric Upload Frequency
The metricStreamIntvlInSec
attribute sets the upload
interval (in seconds) for automatic uploads to the metric streaming endpoints
specified by the metricStreamEndPoint
attribute.
- To enable automatic metric uploads, set the metricStreamIntvlInSec attribute to a non-zero value.
  For example:
  CellCLI> ALTER CELL metricStreamIntvlInSec=25
  The metricStreamIntvlInSec setting is related to the fine-grained collection frequency specified in the metricFGCollIntvlInSec attribute:
  - When automatic metric upload and fine-grained collection are both enabled (metricStreamIntvlInSec>0 and metricFGCollIntvlInSec>0), metricStreamIntvlInSec must be between 5 and 30 times metricFGCollIntvlInSec. For example, if metricFGCollIntvlInSec is set to 5, then metricStreamIntvlInSec must be between 25 and 150.
  - When automatic metric upload is enabled and fine-grained collection is disabled (metricStreamIntvlInSec>0 and metricFGCollIntvlInSec=0), the automatic upload frequency must be between 5 and 30 minutes (metricStreamIntvlInSec must be between 300 and 1800).
- To disable automatic metric uploads, set metricStreamIntvlInSec=0.
  For example:
  CellCLI> ALTER CELL metricStreamIntvlInSec=0
Parent topic: Uploading Metric Observations to a Collection Platform
6.4.4.2.2 Customizing the Metric Stream
By default, a set of key performance metrics is automatically enabled for streaming. But, you can customize the metric stream by enabling or disabling specific metrics.
- To include a metric in the metric stream, use the ALTER METRICDEFINITION command and specify streaming=enabled. For example:
  CellCLI> ALTER METRICDEFINITION N_NIC_KB_TRANS_SEC streaming=enabled
  CellCLI> ALTER METRICDEFINITION N_MB_SENT,N_MB_RECEIVED streaming=enabled
  CellCLI> ALTER METRICDEFINITION streaming=enabled WHERE name LIKE 'N_NIC.*'
- To remove a metric from the metric stream, use the ALTER METRICDEFINITION command and specify streaming=disabled. For example:
  CellCLI> ALTER METRICDEFINITION N_MB_SENT streaming=disabled
- At any time, to view the metrics that are included in the metric stream, use the following command:
  CellCLI> LIST METRICDEFINITION WHERE streaming=enabled
- At any time, to view all of the metric definition details, including each metric description and whether the metric is included in the metric stream, use the following command:
  CellCLI> LIST METRICDEFINITION DETAIL
Parent topic: Uploading Metric Observations to a Collection Platform
6.4.4.2.3 Configuring the Endpoints for Automatic Metric Upload
You can automatically upload (push) the metric stream to one or more
collection endpoints by setting the metricStreamEndPoint
attribute
as follows:
metricStreamEndPoint[+]=((host="endpoint-URL"[,type="stream-format"][,token="authentication-token"][,{httpProxy|httpsProxy}="proxy-server"])
[,(host="endpoint-URL"[,type="stream-format"][,token="authentication-token"][,{httpProxy|httpsProxy}="proxy-server"])]...)
In the metricStreamEndPoint definition:
- host: Specifies the URL for the collection endpoint. The URL can use HTTP or HTTPS.
- type: Optionally specifies the format of the stream. Supported values are:
  - json: Provides the stream in a JSON format
  - plaintext: Provides the stream in a plain text format
  The default value is json.
- token: Optionally specifies the authentication token for the collection endpoint. Consult the metric collection platform for details about generating the token.
- httpProxy or httpsProxy: Optionally specifies a proxy server to facilitate network connectivity to the collection endpoint. A proxy server is required if a firewall resides between the Exadata system and the collection endpoint.
You can use the optional +=
operator to add collection
endpoints to an existing metricStreamEndPoint
definition.
Otherwise, the =
operator overwrites the previous attribute
value.
Example 6-23 Setting Up a JSON Stream
This example shows how to set up a JSON stream. In the example command,
the host
and token
values come from the collection
platform and we assume that network connectivity is through the specified proxy.
CellCLI> ALTER CELL metricStreamEndPoint=((host="https://ingest.stream.example.com/v2/datapoint",type="json",token="wcfA_**********Z58QpKg",httpProxy="www-proxy.example.com:80"))
Example 6-24 Adding a Plain Text Endpoint
This example shows how to add a plain text endpoint to the existing
metricStreamEndPoint
. In the example command, the
host
and token
values come from the collection
platform. The example also assumes that the collection platform is within the
corporate network requiring no network proxy.
CellCLI> ALTER CELL metricStreamEndPoint+=((host="http://idbsrv.example.com:8086/api/v2/write?org=Exadata&bucket=Metrics&precision=ns",type="plaintext",token="6unif**********rOXwtfkG0gWGENyePd6uN6OLR_deTZL4IuG9VTfDWwvpB-QvJcCcFs_NVjmpsyANz0Q8psA=="))
Parent topic: Uploading Metric Observations to a Collection Platform
6.4.4.3 Downloading Metric Observations from an Exadata Server
You can download (pull) the metric stream from an Exadata server by using the provided REST endpoint. The stream includes all metrics enabled for streaming (streaming=enabled), regardless of whether the system is enabled for fine-grained metric collection or configured for automatic metric upload.
On each storage server, the REST endpoint URL is:
https://server-name/metricstream/list?stream=true
On each database server, the endpoint uses port 7879. Consequently, the URL is:
https://server-name:7879/metricstream/list?stream=true
For maximum efficiency, the download interval should be a multiple of the metric collection interval. If fine-grained metric collection is enabled (metricFGCollIntvlInSec>0) on the server, then coordinate the download frequency with the fine-grained collection interval. Otherwise, coordinate the download frequency with the standard 1-minute collection interval.
To facilitate access to the metric stream, use a dedicated user account that has access only to the stream. The following CellCLI command sequence configures such an account, which you can then use for authentication to the REST endpoint. In the command sequence, substitute your own user and role names.
CREATE ROLE metric_collector_role
GRANT PRIVILEGE LIST ON METRICSTREAM ALL ATTRIBUTES WITH ALL OPTIONS TO ROLE metric_collector_role
CREATE USER metric_collector PASSWORD=<password>
GRANT ROLE metric_collector_role TO USER metric_collector
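For illustration only, the following Python sketch polls a storage server's REST endpoint using the dedicated account created above. The host name, the password placeholder, the use of HTTP Basic authentication, and the 60-second polling interval are assumptions; adjust them to match your environment and your metric collection interval.
# Minimal sketch (not an official tool): poll the Exadata metric stream REST endpoint.
# The host name, password handling, Basic authentication, and polling interval are assumptions.
import time
import requests

ENDPOINT = "https://celadm09.example.com/metricstream/list?stream=true"  # storage server; database servers use port 7879
AUTH = ("metric_collector", "your-password-here")  # dedicated account from the CellCLI sequence above
POLL_SECONDS = 60                                  # coordinate with the metric collection interval

while True:
    # In a real deployment, point verify= at the server's CA certificate instead of disabling verification.
    response = requests.get(ENDPOINT, auth=AUTH, verify=False, timeout=30)
    response.raise_for_status()
    for line in response.text.splitlines():
        if line.strip():
            print(line)                            # each non-empty line is one metric observation
    time.sleep(POLL_SECONDS)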
Parent topic: Using Real-Time Insight
6.4.4.4 Tagging Metric Observations
On each Exadata database server and storage server, you can define a set of metric tags, which are included in every observation in the metric stream. These tags help you organize and group observations generated by numerous Exadata servers.
You can configure metric tags by setting the metricStreamTags attribute to a valid JSON string containing tag and value pairs as follows:
metricStreamTags='{"tag1":"value1"[,"tag2":"value2"]...}'
For example:
CellCLI> ALTER CELL metricStreamTags='{"application":"personnel","department":"HR"}'
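If your tag values contain quotes or other special characters, it can help to build the JSON string programmatically before pasting it into the CellCLI command. The following Python sketch is an illustrative helper, not part of CellCLI, and the tag names shown are assumptions.
# Illustrative helper: build a valid JSON value for the metricStreamTags attribute.
# The tag names and values below are assumptions.
import json

tags = {"application": "personnel", "department": "HR"}
metric_stream_tags = json.dumps(tags, separators=(",", ":"))

# Produces the CellCLI command shown in the example above:
#   ALTER CELL metricStreamTags='{"application":"personnel","department":"HR"}'
print(f"ALTER CELL metricStreamTags='{metric_stream_tags}'")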
Parent topic: Using Real-Time Insight
6.4.4.5 Understanding the Metric Stream Format
Real-time metric observations contain a core set of attributes. However, the format of the metric stream depends on the mode of access.
If you are automatically uploading the metric stream to a metric collection platform, you can specify the metric stream format in the metric endpoint configuration. The available formats are JSON or plain text.
Following is an example of the JSON format:
json: {
"gauge": [{
"metric": "OS_NET_RX_BY_SEC",
"value": "0.0012989044189453125",
"timestamp": 1652473286000,
"unit": "MB/sec",
"dimensions": {
"server": "celadm09.example.com",
"objectName": "eth0",
"nodeType": "STORAGE",
"fleet": "example-fleet",
"pod": "dbm01",
"cluster": "c01"
}
}, {
"metric": "SIO_IO_RD_FC_HD_SEC",
"value": "0.0",
"timestamp": 1652473286000,
"unit": "MB/sec",
"dimensions": {
"server": "celadm09.example.com",
"objectName": "SMARTIO",
"nodeType": "STORAGE",
"fleet": "example-fleet",
"pod": "dbm01",
"cluster": "c01"
}
}
]
}
The plain text format contains essentially the same information as the JSON stream. However, with the plain text format, each metric observation is presented on a separate line. Following is an example of the plain text format:
metrics,name=OS_NET_RX_BY_SEC,objectName=eth0,server=celadm09.example.com,unit=MB/sec,nodeType=STORAGE,fleet=example-fleet,pod=dbm01,cluster=c01 value=9.441184615324398E-4 1652473456000000000
metrics,name=OS_NET_RX_BY_SEC,objectName=eth0,server=celadm09.example.com,unit=MB/sec,nodeType=STORAGE,fleet=example-fleet,pod=dbm01,cluster=c01 value=0.002647613311980988 1652473457000000000
If you are downloading the metric stream by using the provided REST endpoint, the data is presented in a format similar to the plain text upload format, where each metric observation is presented on a separate line. Following is an example of the download format:
DS_CPUT{objectName="dbadm05",unit="%",server="dbadm05.example.com",nodeType="KVMHOST",fleet="example-fleet",pod="dbm01",cluster="c01"} 23.10906363831155 1652485449597
DS_MEMUT{objectName="dbadm05",unit="%",server="dbadm05.example.com",nodeType="KVMHOST",fleet="example-fleet",pod="dbm01",cluster="c01"} 99 1652485449597
DS_MEMUT_MS{objectName="dbadm05",unit="%",server="dbadm05.example.com",nodeType="KVMHOST",fleet="example-fleet",pod="dbm01",cluster="c01"} 0.12396045794483294 1652485449597
The following list describes the attributes contained in the metric stream:
- The metric name is identified as follows:
  - In the JSON upload format, the metric name follows the metric tag.
  - In the plain text upload format, the metric name is the value following name=.
  - In the download format, the metric name is the first element on each line, preceding the left brace ({).
- The metric value is located as follows:
  - In the JSON upload format, the metric value follows the value tag.
  - In the plain text upload format, the metric value follows value=.
  - In the download format, the metric value is the second-to-last element on each line, following the right brace (}).
- The time of the metric observation is located as follows:
  - In the JSON upload format, the timestamp follows the timestamp tag. The timestamp is expressed as the number of milliseconds (1 x 10⁻³ sec) since January 1, 1970, 00:00:00 GMT.
  - In the plain text upload format, the timestamp is the last element on each line. The timestamp is expressed as the number of nanoseconds (1 x 10⁻⁹ sec) since January 1, 1970, 00:00:00 GMT.
  - In the download format, the timestamp is the last element on each line. The timestamp is expressed as the number of milliseconds (1 x 10⁻³ sec) since January 1, 1970, 00:00:00 GMT.
- The unit value describes the unit of measurement for the metric observation.
- The server value contains the name of the Exadata server that generated the metric observation.
- The objectName value contains the name of the Exadata object associated with the metric.
- The nodeType value contains the type of the Exadata server that generated the metric observation.
- The fleet, pod, and cluster attributes are examples of user-defined metric tags, which you can use to organize and group observations generated by numerous Exadata servers. You can tag metrics by setting the metricStreamTags CELL attribute.
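To make the download format concrete, the following Python sketch parses lines like the samples shown above into their component attributes. The parsing logic is an illustrative assumption based on the sample lines, not an official parser, and the sample input is hypothetical.
# Illustrative parser (not an official tool) for lines in the REST download format:
#   NAME{key="value",...} value timestamp_in_milliseconds
import re

LINE_PATTERN = re.compile(r'^(?P<name>\w+)\{(?P<dims>[^}]*)\}\s+(?P<value>\S+)\s+(?P<ts>\d+)$')

def parse_observation(line):
    """Split one download-format line into metric name, dimensions, value, and timestamp."""
    match = LINE_PATTERN.match(line.strip())
    if not match:
        return None
    dims = dict(pair.split("=", 1) for pair in match.group("dims").split(",") if "=" in pair)
    dims = {key: value.strip('"') for key, value in dims.items()}  # drop surrounding quotes
    return {
        "metric": match.group("name"),
        "dimensions": dims,
        "value": float(match.group("value")),
        "timestamp_ms": int(match.group("ts")),  # milliseconds since January 1, 1970, 00:00:00 GMT
    }

# Example using one of the sample lines from above:
sample = ('DS_MEMUT{objectName="dbadm05",unit="%",server="dbadm05.example.com",'
          'nodeType="KVMHOST",fleet="example-fleet",pod="dbm01",cluster="c01"} 99 1652485449597')
print(parse_observation(sample))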
Parent topic: Using Real-Time Insight