What to Look For When Monitoring Smart I/O

Smart I/O Not Performing As Expected

Smart I/O operations typically occur when a full table scan or index fast full scan is done on a row source. If smart I/O operations do not work as expected, then users tend to experience noticeable increases in query elapsed times. In some cases, the database shows increased cell smart table scan wait times. However, when wait events such as cell multiblock physical read or direct path read are present instead of cell smart table scan, this is an indicator that smart I/O operations are not being performed.

The following are symptoms and reasons that explain why smart I/O operations might not perform as expected:

Direct reads are a prerequisite for Smart Scan, and Smart Scan cannot occur without direct reads.

The cell multiblock physical read wait event occurs when blocks are read into the buffer cache. A common reason for reading into the buffer cache, rather than using direct reads, is the size of the segment. For small segments, the optimizer favors the buffer cache over direct reads.

Also, by design, Oracle Database shared server sessions do not use direct reads. Consequently, serial queries issued by a shared server session are not eligible for Smart Scan. For parallel queries issued in a shared server session, the parallel worker processes can use direct reads (and Smart Scan), but the query blocks the shared server for its entire duration.
The direct path read wait event occurs when direct reads are performed, but predicates are not offloaded to the storage servers. This may occur when there is a resource shortage on the storage servers. For example, there might be a memory shortage caused by a large number of concurrent parallel queries on the system.

Such resource shortages are typically evident in the ExaWatcher data. In particular, you can review statistics such as Number of low memory threshold failures and Number of no memory threhsold failures in cellsrvstat. ExaWatcher also includes cellmem collection, which shows how memory is consumed on the storage servers, and is visible in the ExaWatcher charts produced using GetExaWatcherResults.sh.

To address this issue, you can review the use of parallel query and potentially reduce the number of active parallel query servers.
Predicate offload is not possible when there is uncommitted data. This typically becomes an issue if large batch operations are modifying the data, and you attempt to run a large query against a large amount of uncommitted data.

When Smart Scan encounters uncommitted data, predicate filtering cannot be offloaded to the storage servers, and additional data must be transported back to the database servers, which appears as increased bytes returned by the Smart Scan. Extra processing is also required on the database server to construct a read-consistent copy of the data, which is manifested in the following ways:
- In the best-case scenario, additional buffer gets or session logical reads are required to construct a read-consistent copy of the data.
- If the undo buffers reside in another database instance, then Oracle RAC-related wait events may also be observed. The Oracle RAC-related wait events are prefixed by gc.
- If the undo blocks do not reside in the buffer cache of any database instance, then additional cell single block physical read waits are observed in conjunction with the single block I/O that is required for read-consistency. The additional I/O can significantly impact the performance of the operation.
When the storage server CPU utilization is high, the storage server sends data back to the database for processing, rather than consuming even more storage server CPU to perform predicate evaluation. This is known as 'reverse offload'.

When this occurs, it is evident in the Reverse Offload column in the Smart IO section of the AWR report, and the database statistic cell physical IO bytes sent directly to DB node to balance CPU usage.

The high CPU utilization on the storage servers may be due to the type of predicates being offloaded. For example, case-insensitive searches, or the use of REGEXP_LIKE utilize more CPU than simpler predicates.

Increased storage server CPU and I/O load may also stem from SQL execution plan changes in the database, in which case, reviewing the execution plans and tuning the affected SQL statements may help to resolve the issue.
When the storage server is unable to perform predicate evaluation, it will send the data back to the database for processing. This is also known as passthough (or 'passthru'). The following all indicate that passthrough is occurring:
- A large value in the Passthru column in the Smart IO section of the AWR report, when compared to eligible bytes.
- A large value in the database statistic cell num bytes in passthru during predicate offload, when compared to cell physical IO bytes eligible for smart IO.
- A large Cell passthru IO bytes value in the SQL monitor row source statistic, compared to the Eligible bytes for Smart IO value.
Possible causes include:
- Quarantines — To confirm, review the database statistic cell num bytes in passthru due to quarantine, or the SQL monitor row source statistic Cell passthru IO bytes due to quarantine.
- Database timezone upgrade — In some situations, a query cannot use Smart Scan while a database timezone upgrade is ongoing. Review the database statistic cell num smart IO sessions using passthru mode due to timezone from the Global Activity Statistics or Instance Activity Statistics section, or the Passthru Reasons in the Smart IO section of the AWR report. Depending on the database release, you may also observe the cell smart table scan: db timezone upgrade or cell smart index scan: db timezone upgrade wait event.
- User setting — A user or application may set cell_offload_processing=false, which disables smart scans. To confirm, review the database statistic cell num smart IO sessions using passthru mode due to user from the Global Activity Statistics or Instance Activity Statistics section, or the Passthru Reasons in the Smart IO section of the AWR report. Depending on the database release, you may also observe the cell smart table scan: disabled by user or cell smart index scan: disabled user wait event.
- Operation cannot be offloaded — There are other reasons why the storage servers may be unable to perform predicate offload. Instances of this occurrence would be visible in the database statistic cell num smart IO sessions using passthru mode due to cellsrv, or in wait events cell smart table scan: passthrough or cell smart index scan: passthrough. The following section describes the reasons in detail.
In some situations, Oracle Database reverts to block I/O mode for an operation typically performed using smart I/O. When this occurs, it is evident in the database statistic cell num bytes in block IO during predicate offload.

There are additional related statistics that provide insight into the underlying cause. These statistics have names beginning with cell num smart IO sessions in rdbms block IO due to. For example, cell num smart IO sessions in rdbms block IO due to no cell mem counts sessions that revert to block I/O mode because of a storage server memory shortage that prevents using smart I/O.

Operation Not Being Offloaded

A smart I/O operation cannot be offloaded to the Exadata storage servers in the following cases:

A scan is performed on a clustered table
A fast full scan is performed on reverse key indexes
The table has row dependencies enabled or the rowscn is being fetched
The optimizer wants the scan to return rows in ROWID order
The command CREATE INDEX using nosort
A LONG column is being selected or queried.
The query contains a compressed or out-of-line LOB. An out-of-line LOB stores LOB data apart from the other row data and is typically larger than 4 KB in size.
A SELECT ... VERSIONS query is done on a table
The tablespace is encrypted, and the CELL_OFFLOAD_DECRYPTION parameter is set to FALSE. In order for the Oracle Exadata System Software to perform decryption, Oracle Database needs to send the decryption key to the storage server. This feature is typically disabled, if there are security concerns about keys being shipped across the network to the storage server
The tablespace is not completely stored on Oracle Exadata Storage Server.
The predicate evaluation is on a virtual column.
Although offloading is supported for most SQL operators and functions, Oracle Exadata System Software does not support offloading for some SQL operators and functions. The dynamic performance view V$SQLFN_METADATA includes information about whether offloading is supported for a SQL operator or function. If the OFFLOADABLE column contains YES, then offloading is supported for the corresponding operator or function. NO indicates that offloading is not supported for the corresponding operator or function.
On systems with software versions earlier than Oracle Exadata System Software 24.1.0 and Oracle Database 23ai, offloading is also not available when:
- A scan is performed on an index-organized table
- A query references more than 255 columns from a heap table that is uncompressed or compressed using Basic or OLTP compression. However, offloading is possible for queries on tables compressed using Exadata Hybrid Columnar Compression, even if the query references more than 255 columns.

Storage Index Not Performing as Expected

The storage index resides in storage server memory. Before the persistent storage index feature (introduced in Oracle Exadata System Software release 21.2.0), the storage index is lost whenever cellsrv stops and must be rebuilt after cellsrv starts. Consequently, on systems not enabled with the persistent storage index feature, the storage index is not immediately available every time cellsrv starts.

On systems enabled with the persistent storage index feature, the storage index resides in shared memory, which is maintained across cellsrv restarts. Furthermore, storage index data is automatically saved to persistent storage during a graceful server shutdown. During server restart, the storage index is automatically restored but is unavailable during the restoration process. Finally, the storage index is lost and must be completely rebuilt if the storage server suffers from an ungraceful shutdown (such as a power outage or kernel panic).

For segments in an unencrypted tablespace, the storage index is maintained when DML occurs. However, for segments in an encrypted tablespace, DML invalidates the portion of the storage index associated with each changed data chunk (1 MB). The invalidated chunks are rebuilt during the next scan of the segment. However, the overall efficiency of the storage index is not optimal while portions are invalid.

To monitor storage index performance, monitor:

The Storage Index column in the Smart IO section of the AWR report.
The database statistic cell physical IO bytes saved by storage index.
The SQL monitor row source statistic SI saved bytes.

You may see reduced storage index savings if the storage index is not yet built (or rebuilt). While the storage index is building, you will see increases in the database statistic cell physical IO bytes added to storage index and the SQL monitor row source statistic Bytes added to storage index.

Columnar Cache Not Performing as Expected

Like storage index, if the columnar cache is not yet built (or rebuilt), you may see reduced savings associated with columnar cache. To monitor columnar cache performance, monitor the Columnar Cache sections of the AWR report, or the database statistic cell physical IO bytes saved by columnar cache, or the SQL monitor row source statistic Columnar cache saved bytes.

Similar to storage index, the columnar cache is rebuilt every time cellsrv starts. Consequently, the columnar cache cannot benefit operations immediately after cellsrv starts.

The columnar cache is automatically populated and maintained by the storage server when a Smart Scan is performed. For Smart Scan operations on uncompressed segments, segments compressed using OLTP compression, and segments compressed using Exadata Hybrid Columnar Compression, data is automatically converted into the columnar cache format (no memcompress) as part of the Smart Scan.

However, if you are using Oracle Database In-Memory, data is rewritten into Oracle Database In-Memory columnar format (memcompress for query or memcompress for capacity) by using a background process. Consequently, operations on that data do not benefit from the Oracle Database In-Memory optimizations until the cache is repopulated. Information about the population jobs is available in the Columnar Cache Population sections of the AWR report.

If you are reading less from columnar cache, then it will be evident by lower values in the database statistics: cell physical IO bytes processed for IM Query, cell physical IO bytes processed for IM Capacity, or cell physical IO bytes processed for no memcompress. The equivalent SQL monitor row source statistics are: IM Query bytes, IM Capacity bytes, and No memcompress bytes.

6.3.7.6 What to Look For When Monitoring Smart I/O