1 Introducing Oracle Exadata System Software

This chapter introduces Oracle Exadata System Software.

1.1 Overview of Oracle Exadata System Software

Oracle Exadata Storage Server is a highly optimized storage server that runs Oracle Exadata System Software to store and access Oracle Database data.

With traditional storage, data is transferred to the database server for processing. In contrast, Oracle Exadata System Software provides database-aware storage services, such as the ability to offload SQL and other database processing from the database server, while remaining transparent to the SQL processing and database applications. Oracle Exadata Database Machine storage servers process data at the storage level, and pass only what is needed to the database servers.

Oracle Exadata System Software is installed on both the storage servers and the database servers. Oracle Exadata System Software offloads some SQL processing from the database server to the storage servers. Oracle Exadata System Software enables function shipping between the database instance and the underlying storage, in addition to traditional data shipping. Function shipping greatly reduces the amount of data processing that must be done by the database server. Eliminating data transfers and database server workload can greatly benefit query processing operations that often become bandwidth constrained. Eliminating data transfers can also provide a significant benefit to online transaction processing (OLTP) systems that include large batch and report processing operations.

The hardware components of Oracle Exadata Storage Server are carefully chosen to match the needs of high performance processing. The Oracle Exadata System Software is optimized to maximize the advantage of the hardware components. Each storage server delivers outstanding processing bandwidth for data stored on disk, often several times better than traditional solutions.

Oracle Exadata Database Machine storage servers use state-of-the-art RDMA Network Fabric interconnections between servers and storage. Each RDMA Network Fabric link provides bandwidth of 40 Gb/s for InfiniBand Network Fabric or 100 Gb/s for RoCE Network Fabric. Additionally, the interconnection protocol uses direct data placement, also referred to as direct memory access (DMA), to ensure low CPU overhead by directly moving data from the wire to database buffers with no extra copies. The RDMA Network Fabric has the flexibility of a LAN network with the efficiency of a storage area network (SAN). With an RDMA Network Fabric network, Oracle Exadata Database Machine eliminates network bottlenecks that could reduce performance. This RDMA Network Fabric network also provides a high performance cluster interconnection for Oracle Real Application Clusters (Oracle RAC) servers.

Each storage server has disk (rotating) storage. There are several storage server configurations, each designed to maximize some aspect of storage based on your requirements. You can specify storage servers with Exadata Smart Flash Cache either installed or available as an option. Some storage servers also have a Persistent Memory (PMEM) cache. Each of these cache configurations uses RDMA to provide maximum transfer rates.

The Oracle Exadata Database Machine architecture scales to any level of performance. To achieve higher performance or greater storage capacity, you add more storage servers (cells) to the configuration. As more storage servers are added, capacity and performance increase linearly. Data is mirrored across storage servers to ensure that the failure of a storage server does not cause loss of data or availability. The scale-out architecture achieves near infinite scalability, while lowering costs by allowing storage to be purchased incrementally on demand.

Note:

Oracle Exadata System Software must be used with Oracle Exadata Database Machine storage server hardware, and only supports Oracle databases on the database servers of Oracle Exadata Database Machines. Information is available on My Oracle Support at

http://support.oracle.com

and on the Products page of Oracle Technology Network at

http://www.oracle.com/technetwork/index.html

1.2 Key Features of Oracle Exadata System Software

This section describes the key features of Oracle Exadata System Software.

1.2.1 Reliability, Modularity, and Cost-Effectiveness

Oracle Exadata System Software enables cost-effective modular storage hardware to be used in a scale-out architecture while providing a high level of availability and reliability.

All single points of failure are eliminated in the Oracle Exadata Storage Server architecture by data mirroring, fault isolation technology, and protection against disk and other storage hardware failure. Even brownouts are limited or eliminated when failures occur.

In the Oracle Exadata Storage Server architecture, one or more storage cells can support one or more databases. The placement of data is transparent to database users and applications. Storage cells use Oracle Automatic Storage Management (Oracle ASM) to distribute data evenly across the cells. Because Oracle Exadata Storage Servers support dynamic disk insertion and removal, the online dynamic data redistribution feature of Oracle ASM ensures that data is appropriately balanced across the newly added, or remaining, disks without interrupting database processing. Oracle Exadata Storage Servers also provide data protection from disk and cell failures.

1.2.2 Compatibility with Oracle Database

When the minimum required versions are met, all Oracle Database features are fully supported with Oracle Exadata System Software.

Oracle Exadata System Software works equally well with single-instance or Oracle Real Application Clusters (Oracle RAC) deployments of Oracle Database. Oracle Data Guard, Oracle Recovery Manager (RMAN), Oracle GoldenGate, and other database features are managed the same with Exadata storage cells as with traditional storage. This enables database administrators to use the same tools with which they are familiar.

Refer to My Oracle Support Doc ID 888828.1 for a complete list of the minimum required software versions.

1.2.3 Smart Flash Technology

The Exadata Smart Flash Cache feature of the Oracle Exadata System Software intelligently caches database objects in flash memory, replacing slow, mechanical I/O operations to disk with very rapid flash memory operations.

1.2.3.1 Exadata Smart Flash Cache

Exadata Smart Flash Cache holds frequently accessed data in high-performance flash storage, while most data is kept in very cost-effective disk storage.

Caching occurs automatically and requires no user or administrator effort.

Exadata Smart Flash Cache intelligently determines the data that is most useful to cache based on data usage, access patterns, and hints from the database that indicate the type of data being accessed. It also avoids caching data that will never be reused or will not fit into the cache.

Although it is generally not required or recommended, Oracle Exadata System Software also enables administrators to override the default caching policy and keep specific table and index segments in or out of the cache.
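For example, administrators can pin a segment in the cache, or exclude it, using the CELL_FLASH_CACHE storage attribute. The table names below are hypothetical:

SQL> ALTER TABLE calls STORAGE (CELL_FLASH_CACHE KEEP);
SQL> ALTER TABLE call_history STORAGE (CELL_FLASH_CACHE NONE);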

Originally, Exadata Smart Flash Cache operated exclusively in Write-Through mode. In Write-Through mode, database writes go to disk first, and subsequently populate Flash Cache. If a flash device fails with Exadata Smart Flash Cache operating in Write-Through mode, there is no data loss because the data is already on disk.

1.2.3.2 Write-Back Flash Cache

Write-Back Flash Cache enables write I/Os directly to Exadata Smart Flash Cache.

With Exadata Smart Flash Cache in Write-Back mode, database writes go to Flash Cache first and later to disk. Write-Back mode was introduced with Oracle Exadata System Software release 11.2.3.2.0.

Write-intensive applications can benefit significantly from Write-Back mode by taking advantage of the fast latencies provided by flash. If your application writes intensively, and you experience high I/O latency or significant free buffer waits, then you should consider using Write-Back Flash Cache.

With Exadata Smart Flash Cache in Write-Back mode, the total amount of disk I/O also reduces when the cache absorbs multiple writes to the same block before writing it to disk. The saved I/O bandwidth can be used to increase the application throughput or service other workloads.

However, if a flash device fails while using Write-Back mode, data that is not yet persistent to disk is lost and must be recovered from a mirror copy. For this reason, Write-Back mode is recommended in conjunction with high redundancy ASM disk groups.

The contents of the Write-Back Flash Cache are persisted across reboots, eliminating any warm-up time needed to populate the cache.
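The cache mode is configured on each storage server using the CellCLI utility. The following command is a sketch only; on older Oracle Exadata System Software releases, changing the mode may first require flushing and dropping the existing flash cache:

CellCLI> ALTER CELL flashCacheMode=WriteBack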

1.2.3.3 Exadata Smart Flash Log

Exadata Smart Flash Log improves transaction response times and increases overall database throughput for I/O intensive workloads by accelerating performance-critical log write operations.

The time to commit user transactions is very sensitive to the latency of log write operations. In addition, many performance-critical database algorithms, such as space management and index splits, are very sensitive to log write latency.

Although the disk controller has a large battery-backed DRAM cache that can accept writes very quickly, some write operations to disk can still be slow during busy periods when the disk controller cache is occasionally filled with blocks that have not been written to disk. Noticeable performance issues can arise even with relatively few slow redo log write operations.

Exadata Smart Flash Log reduces the average latency for performance-sensitive redo log write I/O operations, thereby eliminating performance bottlenecks that may occur due to slow redo log writes. Exadata Smart Flash Log removes latency spikes by simultaneously performing redo log writes to two media devices. The redo write is acknowledged as soon as the first write to either media device completes.

Prior to Oracle Exadata System Software release 20.1, Exadata Smart Flash Log performed simultaneous writes to disk and flash storage. With this configuration, Exadata Smart Flash Log improves average log write latency and increases overall database throughput. However, because all log writes must eventually persist to disk, this configuration is limited by the overall disk throughput, and provides little relief for applications that are constrained by disk throughput.

Oracle Exadata System Software release 20.1 adds a further optimization, known as Smart Flash Log Write-Back, that uses Exadata Smart Flash Cache in Write-Back mode instead of disk storage. With this configuration, Exadata Smart Flash Log improves average log write latency and overall log write throughput to eliminate logging bottlenecks for demanding throughput-intensive applications.

1.2.4 Persistent Memory Accelerator and RDMA

Persistent Memory (PMEM) Accelerator provides direct access to persistent memory using Remote Direct Memory Access (RDMA), enabling faster response times and lower read latencies.

Starting with Oracle Exadata System Software release 19.3.0, workloads that require ultra-low response times, such as stock trading and Internet of Things (IoT) devices, can take advantage of PMEM and RDMA in the form of a PMEM Cache and PMEM Log.

PMEM is a new storage tier that was first released on Exadata X8M. When database clients read from the PMEM cache, the client software performs an RDMA read of the cached data, which bypasses the storage server software and delivers results much faster than Exadata Smart Flash Cache.

PMEM cache works in conjunction with Exadata Smart Flash Cache. The following table describes the supported caching mode combinations when PMEM Cache is configured:

PMEM Cache Mode   Flash Cache Mode   Supported Configuration?
Write-Through     Write-Through      Yes. This is the default configuration for High Capacity (HC) servers with normal redundancy.
Write-Through     Write-Back         Yes. This is the default configuration for HC servers with high redundancy, and also the default configuration for Extreme Flash (EF) servers.
Write-Back        Write-Back         Yes.
Write-Back        Write-Through      No. Without the backing of Write-Back Flash Cache, write-intensive workloads can overload the PMEM Cache in Write-Back mode.

If PMEM Cache is not configured, Exadata Smart Flash Cache is supported in both Write-Back and Write-Through modes.

Redo log writes are critical database operations and need to complete in a timely manner to prevent load spikes or stalls. Exadata Smart Flash Log is designed to prevent redo write latency outliers. PMEM Log helps to further reduce redo log write latency by using PMEM and RDMA.

With PMEM Log, database clients send redo log I/O buffers directly to PMEM on the storage servers using RDMA, thereby reducing transport latency. The cell server (cellsrv) then writes the redo to Exadata Smart Flash Log (if enabled) and disk at a later time.

Reduced redo log write latency improves OLTP performance, resulting in higher transaction throughput. In cases where PMEM Log is bypassed, Exadata Smart Flash Log can still be used.

1.2.5 Centralized Storage

You can use Oracle Exadata Storage Server to consolidate your storage requirements into a central pool that can be used by multiple databases.

Oracle Exadata System Software with Oracle Automatic Storage Management (Oracle ASM) evenly distributes the data and I/O load for every database across available disks in the storage pool. Every database can use all of the available disks to achieve superior I/O rates. Oracle Exadata Storage Servers can provide higher efficiency and performance at a lower cost while also lowering your storage administration overhead.

1.2.6 I/O Resource Management (IORM)

I/O Resource Management (IORM) and the Oracle Database Resource Manager enable multiple databases and pluggable databases to share the same storage while ensuring that I/O resources are allocated across the various databases.

Oracle Exadata System Software works with IORM and Oracle Database Resource Manager to ensure that customer-defined policies are met, even when multiple databases share the same set of storage servers. As a result, one database cannot monopolize the I/O bandwidth and degrade the performance of the other databases.

IORM enables storage cells to service I/O resources among multiple applications and users across all databases in accordance with sharing and prioritization levels established by the administrator. This improves the coexistence of online transaction processing (OLTP) and reporting workloads, because latency-sensitive OLTP applications can be given a larger share of disk and flash I/O bandwidth than throughput-sensitive batch applications. Oracle Database Resource Manager enables the administrator to control processor utilization on the database host on a per-application basis. Combining IORM and Oracle Database Resource Manager enables the administrator to establish more accurate policies.

IORM also manages the space utilization for Exadata Smart Flash Cache and PMEM cache. Critical OLTP workloads can be guaranteed space in Exadata Smart Flash Cache or PMEM cache to provide consistent performance.

IORM for a database or pluggable database (PDB) is implemented and managed from the Oracle Database Resource Manager. Oracle Database Resource Manager in the database instance communicates with the IORM software in the storage cell to manage user-defined service-level targets. Database resource plans are administered from the database, while interdatabase plans are administered on the storage cell.
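For example, an interdatabase plan that favors an OLTP database over a reporting database could be set on each storage cell with a CellCLI command such as the following. The database names are hypothetical, and the exact plan attributes depend on the Oracle Exadata System Software release:

CellCLI> ALTER IORMPLAN dbplan=((name=oltpdb, share=8), (name=reptdb, share=2), (name=other, share=1))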


1.2.7 Exadata Hybrid Columnar Compression

Exadata Hybrid Columnar Compression stores data using column organization, which brings similar values close together and enhances compression ratios.

Using Exadata Hybrid Columnar Compression, data is organized into sets of rows called compression units. Within a compression unit, data is organized by column and then compressed. Each row is self-contained within a compression unit.

Database operations work transparently against compressed objects, so no application changes are required. The database compresses data manipulated by any SQL operation, although compression levels are higher for direct path loads.

You can specify the following types of Exadata Hybrid Columnar Compression, depending on your requirements:

  • Warehouse compression: This type of compression is optimized for query performance, and is intended for data warehouse applications.
  • Archive compression: This type of compression is optimized for maximum compression levels, and is intended for historic data and data that does not change.

Assume that you apply Exadata Hybrid Columnar Compression to a daily_sales table. At the end of every day, the table is populated with items and the number sold, with the item ID and date forming a composite primary key. A row subset is shown in the following table.

Table 1-1 Sample Table daily_sales

Item_ID   Date        Num_Sold   Shipped_From   Restock
1000      01-JUN-07   2          WAREHOUSE1     Y
1001      01-JUN-07   0          WAREHOUSE3     N
1002      01-JUN-07   1          WAREHOUSE3     N
1003      01-JUN-07   0          WAREHOUSE2     N
1004      01-JUN-07   2          WAREHOUSE1     N
1005      01-JUN-07   1          WAREHOUSE2     N

The database stores a set of rows in an internal structure called a compression unit. For example, assume that the rows in the previous table are stored in one unit. Exadata Hybrid Columnar Compression stores each unique value from column 4 (Shipped_From) with metadata that maps the values to the rows. Conceptually, the compressed value can be represented as:

WAREHOUSE1WAREHOUSE3WAREHOUSE2

The database then compresses the repeated word WAREHOUSE in this value by storing it once and replacing each occurrence with a reference. If the reference is smaller than the original word, then the database achieves compression. The compression benefit is particularly evident for the Date column, which contains only one unique value.

Each compression unit can span multiple data blocks. The values for a particular column may or may not span multiple blocks.

Exadata Hybrid Columnar Compression has implications for row locking. When an update occurs for a row in an uncompressed data block, only the updated row is locked. In contrast, the database must lock all rows in the compression unit if an update is made to any row in the unit. Updates to rows using Exadata Hybrid Columnar Compression cause rowids to change.

Note:

When tables use Exadata Hybrid Columnar Compression, Oracle DML locks larger blocks of data (compression units) which may reduce concurrency.

Oracle Database supports four methods of table compression.

Table 1-2 Table Compression Methods

Table Compression Method   Compression Level                                       CPU Overhead                                            Applications
Basic compression          High                                                    Minimal                                                 DSS
OLTP compression           High                                                    Minimal                                                 OLTP, DSS
Warehouse compression      Higher (depends on the specified level, LOW or HIGH)    Higher (depends on the specified level, LOW or HIGH)    DSS
Archive compression        Highest (depends on the specified level, LOW or HIGH)   Highest (depends on the specified level, LOW or HIGH)   Archiving

Warehouse compression and archive compression achieve the highest compression levels because they use Exadata Hybrid Columnar Compression technology. Exadata Hybrid Columnar Compression technology uses a modified form of columnar storage instead of row-major storage. This enables the database to store similar data together, which improves the effectiveness of compression algorithms. Because Exadata Hybrid Columnar Compression requires high CPU overhead for DML, use it only for data that is updated infrequently.

The higher compression levels of Exadata Hybrid Columnar Compression are achieved only with data that is direct-path inserted. Conventional inserts and updates are supported, but result in a less compressed format, and reduced compression level.

The following table lists characteristics of each table compression method.

Table 1-3 Table Compression Characteristics

Basic compression
  CREATE/ALTER TABLE syntax: COMPRESS [BASIC] (COMPRESS and COMPRESS BASIC are equivalent)
  Direct-path insert: Yes
  DML: Yes. Inserted and updated rows are stored uncompressed.

OLTP compression
  CREATE/ALTER TABLE syntax: COMPRESS FOR OLTP
  Direct-path insert: Yes
  DML: Yes

Warehouse compression
  CREATE/ALTER TABLE syntax: COMPRESS FOR QUERY [LOW|HIGH]
  Direct-path insert: Yes
  DML: Yes, with high CPU overhead. Inserted and updated rows go to a block with a less compressed format and have a lower compression level.

Archive compression
  CREATE/ALTER TABLE syntax: COMPRESS FOR ARCHIVE [LOW|HIGH]
  Direct-path insert: Yes
  DML: Yes, with high CPU overhead. Inserted and updated rows go to a block with a less compressed format and have a lower compression level.

The COMPRESS FOR QUERY HIGH option is the default data warehouse compression mode. It provides good compression and performance. The COMPRESS FOR QUERY LOW option should be used in environments where load performance is critical. It loads faster than data compressed with the COMPRESS FOR QUERY HIGH option.

The COMPRESS FOR ARCHIVE LOW option is the default archive compression mode. It provides a high compression level and good query performance. It is ideal for infrequently-accessed data. The COMPRESS FOR ARCHIVE HIGH option should be used for data that is rarely accessed.

A compression advisor, provided by the DBMS_COMPRESSION package, helps you determine the expected compression level for a particular table with a particular compression method.
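As a sketch, the advisor can be invoked as follows. The schema, table, and scratch tablespace names are hypothetical, and the compression type constants (such as COMP_QUERY_HIGH) vary between Oracle Database releases:

SQL> DECLARE
       blkcnt_cmp    PLS_INTEGER;
       blkcnt_uncmp  PLS_INTEGER;
       row_cmp       PLS_INTEGER;
       row_uncmp     PLS_INTEGER;
       cmp_ratio     NUMBER;
       comptype_str  VARCHAR2(100);
     BEGIN
       -- Estimate the QUERY HIGH ratio for the hypothetical SALES.DAILY_SALES table
       DBMS_COMPRESSION.GET_COMPRESSION_RATIO(
         scratchtbsname => 'USERS',
         ownname        => 'SALES',
         objname        => 'DAILY_SALES',
         subobjname     => NULL,
         comptype       => DBMS_COMPRESSION.COMP_QUERY_HIGH,
         blkcnt_cmp     => blkcnt_cmp,
         blkcnt_uncmp   => blkcnt_uncmp,
         row_cmp        => row_cmp,
         row_uncmp      => row_uncmp,
         cmp_ratio      => cmp_ratio,
         comptype_str   => comptype_str);
       DBMS_OUTPUT.PUT_LINE('Estimated compression ratio: ' || cmp_ratio);
     END;
     /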

You specify table compression with the COMPRESS clause of the CREATE TABLE command. You can enable compression for an existing table by using these clauses in an ALTER TABLE statement. In this case, only data that is inserted or updated is compressed after compression is enabled. Similarly, you can disable table compression for an existing compressed table with the ALTER TABLE...NOCOMPRESS command. In this case, all data that was already compressed remains compressed, and new data is inserted uncompressed.
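For example, using the daily_sales sample table from earlier in this chapter (the sales_archive table name is hypothetical):

SQL> CREATE TABLE sales_archive COMPRESS FOR ARCHIVE LOW AS SELECT * FROM daily_sales;
SQL> ALTER TABLE daily_sales COMPRESS FOR QUERY HIGH;
SQL> ALTER TABLE daily_sales NOCOMPRESS;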

1.2.8 In-Memory Columnar Format Support

In an Oracle Exadata Database Machine environment, the data is automatically stored in In-Memory columnar format in the flash cache when it will improve performance.

Oracle Exadata Database Machine supports all of the In-Memory optimizations, such as accessing only the compressed columns required, SIMD vector processing, storage indexes, and so on.

If you set the INMEMORY_SIZE database initialization parameter to a non-zero value (requires the Oracle Database In-Memory option), then objects accessed using a Smart Scan are brought into the flash cache and are automatically converted into the In-Memory columnar format. The data is converted initially into a columnar cache format, which is different from Oracle Database In-Memory’s columnar format. The data is rewritten in the background into Oracle Database In-Memory columnar format. As a result, all subsequent accesses to the data benefit from all of the In-Memory optimizations when that data is retrieved from the flash cache.
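For example, the column store can be enabled with a statement such as the following. The size shown is only illustrative, and the change requires the Oracle Database In-Memory option and an instance restart:

SQL> ALTER SYSTEM SET INMEMORY_SIZE = 4G SCOPE=SPFILE;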



Any write to an in-memory table does not invalidate the entire columnar cache of that table. It only invalidates the columnar cache unit of the disk region in which the block resides. For subsequent scans after a table update, a large part of the table is still in the columnar cache. The scans can still make use of the columnar cache, except for the units in which the writes were made. For those units, the query uses the original block version to get the data. After a sufficient number of scans, the invalidated columnar cache units are automatically repopulated in the columnar format.

A new segment-level attribute, CELLMEMORY, has also been introduced to help control which objects should not be populated into flash using the In-Memory columnar format and which type of compression should be used. Just like the INMEMORY attribute, you can specify different compression levels as sub-clauses to the CELLMEMORY attribute. However, not all of the INMEMORY compression levels are available; only MEMCOMPRESS FOR QUERY LOW and MEMCOMPRESS FOR CAPACITY LOW (default). You specify the CELLMEMORY attribute using a SQL command, such as the following:

ALTER TABLE trades CELLMEMORY MEMCOMPRESS FOR QUERY LOW

The PRIORITY sub-clause available with Oracle Database In-Memory is not available on Oracle Exadata Database Machine because the process of populating the flash cache on Exadata storage servers is different from populating DRAM in the In-Memory column store on Oracle Database servers.

1.2.9 Offloading of Data Search and Retrieval Processing

One of the most powerful features of Oracle Exadata System Software is that it offloads the data search and retrieval processing to the storage servers.

One of the main advantages of Exadata Smart Scan Offload, or simply Smart Scan, is that it uses storage server CPUs, freeing the database server from the I/O work, including decompression of data stored in compressed format. Smart Scan further reduces the processing on the database server by performing predicate filtering, which evaluates database predicates in storage to optimize the performance of certain classes of bulk data processing.

Oracle Database can optimize the performance of queries that perform full table, fast full index, and full bitmap index scans to evaluate selective predicates in Oracle Exadata Storage Server. The database can complete these queries faster by pushing the database expression evaluations to the storage cell. These expressions include simple SQL command predicates, such as amount > 200, and column projections, such as SELECT customer_name. For example:

SQL> SELECT customer_name FROM calls WHERE amount > 200;

In the preceding example, only the rows satisfying the predicate, and only the specified columns, are returned to the database server, eliminating unproductive data transfer.

Oracle Exadata System Software uses storage-side predicate evaluation that transfers simplified, predicate evaluation operations for table and index scans to the storage cell. This brings the table scan closer to the disk to enable a higher bandwidth, and prevents sending unmatched rows to hosts.

Figure 1-2 Offloading Data Search and Retrieval


1.2.10 Offloading of Incremental Backup Processing

To optimize the performance of incremental backups, the database can offload block filtering to Oracle Exadata Storage Server.

This optimization is only possible when taking backups using Oracle Recovery Manager (RMAN). The offload processing is done transparently without user intervention. During offload processing, Oracle Exadata System Software filters out the blocks that are not required for the incremental backup in progress. Therefore, only the blocks that are required for the backup are sent to the database, making backups significantly faster.
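No special syntax is required to benefit from the offload; an ordinary RMAN incremental backup, such as the following, is optimized automatically:

RMAN> BACKUP INCREMENTAL LEVEL 1 DATABASE;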

1.2.11 Fault Isolation with Quarantine

Oracle Exadata System Software has the ability to learn from past events to avoid potential fatal errors.

If a faulty SQL statement caused a server crash in the past, Oracle Exadata System Software quarantines that SQL statement, so that when the statement occurs again, Oracle Exadata System Software does not allow it to perform Smart Scan. This reduces the chance of server software crashes, and improves storage availability. The following types of quarantine are available:

  • SQL Plan: Created when Oracle Exadata System Software crashes while performing Smart Scan for a SQL statement. As a result, the SQL Plan for the SQL statement is quarantined, and Smart Scan is disabled for the SQL plan.

  • Disk Region: Created when Oracle Exadata System Software crashes while performing Smart Scan of a disk region. As a result, the 1 MB disk region being scanned is quarantined and Smart Scan is disabled for the disk region.

  • Database: Created when Oracle Exadata System Software detects that a particular database causes instability to a cell. Instability detection is based on the number of SQL Plan Quarantines for a database. Smart Scan is disabled for the database.

  • Cell Offload: Created when Oracle Exadata System Software detects some offload feature has caused instability to a cell. Instability detection is based on the number of Database Quarantines for a cell. Smart Scan is disabled for all databases.

  • Intra-Database Plan: Created when Oracle Exadata System Software crashes while processing an intra-database resource plan. Consequently, the intra-database resource plan is quarantined and not enforced. Other intra-database resource plans in the same database are still enforced. Intra-database resource plans in other databases are not affected.

  • Inter-Database Plan: Created when Oracle Exadata System Software crashes while processing an inter-database resource plan. Consequently, the inter-database resource plan is quarantined and not enforced. Other inter-database resource plans are still enforced.

  • I/O Resource Management (IORM): Created when Oracle Exadata System Software crashes in the I/O processing code path. IORM is effectively disabled by setting the IORM objective to basic and all resource plans are ignored.

  • Cell-to-Cell Offload: See "Quarantine Manager Support for Cell-to-Cell Offload Operations".

When a quarantine is created, alerts notify administrators of what was quarantined, why the quarantine was created, when and how the quarantine can be dropped manually, and when the quarantine is dropped automatically. All quarantines are automatically removed when a cell is patched or upgraded.

CellCLI commands are used to manually manipulate quarantines. For instance, the administrator can manually create a quarantine, drop a quarantine, change attributes of a quarantine, and list quarantines.
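For example, the following commands list all quarantines in detail and then drop one by name. The quarantine name shown is hypothetical:

CellCLI> LIST QUARANTINE DETAIL
CellCLI> DROP QUARANTINE 2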

1.2.11.1 Quarantine Manager Support for Cell-to-Cell Offload Operations

Minimum Exadata software required: 12.2.1.1.0

Quarantine manager support is enabled for rebalance and high throughput writes in cell-to-cell offload operations. If Exadata detects a crash during these operations, the offending operation is quarantined, and Exadata falls back to using non-offloaded operations.

These types of quarantines are most likely caused by incompatible versions of CELLSRV. If such quarantines occur on your system, contact Oracle Support Services.

For rebalance operations, the quarantine is based on the ASM cluster ID. Rebalance will continue using the fallback path, which is slower.

For high throughput writes that originated from a database, the quarantine is based on a combination of ASM cluster ID and database ID.

For high throughput writes that originated from a CDB or PDB, the quarantine is based on a combination of ASM cluster ID and container database ID.

To identify these types of quarantine, run the LIST QUARANTINE DETAIL command and check the value of the quarantineType attribute. Values for this attribute for these quarantines are ASM_OFFLOAD_REBALANCE and HIGH_THROUGHPUT_WRITE. For the HIGH_THROUGHPUT_WRITE type there is a database case and a CDB case.

The LIST QUARANTINE statement produces output that looks like the following:

For rebalance:

CellCLI> list quarantine detail
 name:                   2
 asmClusterId:           b6063030c0ffef8dffcc99bd18b91a62
 cellsrvChecksum:        9f98483ef351a1352d567ebb1ca8aeab
 clientPID:              10308
 comment:                None
 crashReason:            ORA-600[CacheGet::process:C2C_OFFLOAD_CACHEGET_CRASH]
 creationTime:           2016-06-23T22:33:30-07:00
 dbUniqueID:             0
 dbUniqueName:           UnknownDBName
 incidentID:             1
 quarantineMode:         "FULL Quarantine"
 quarantinePlan:         SYSTEM
 quarantineReason:       Crash
 quarantineType:         ASM_OFFLOAD_REBALANCE
 remoteHostName:         slc10vwt
 rpmVersion:             OSS_MAIN_LINUX.X64_160623

For high throughput writes that originated from database:

CellCLI> list quarantine detail
 name:                   10
 asmClusterId:           b6063030c0ffef8dffcc99bd18b91a62
 cellsrvChecksum:        9f98483ef351a1352d567ebb1ca8aeab
 clientPID:              8377
 comment:                None
 crashReason:            ORA-600[CacheGet::process:C2C_OFFLOAD_CACHEGET_CRASH]
 creationTime:           2016-06-23T23:47:01-07:00
 conDbUniqueID:          0
 conDbUniqueName:        UnknownDBName
 dbUniqueID:             4263312973
 dbUniqueName:           WRITES
 incidentID:             25
 quarantineMode:         "FULL Quarantine"
 quarantinePlan:         SYSTEM
 quarantineReason:       Crash
 quarantineType:         HIGH_THROUGHPUT_WRITE
 remoteHostName:         slc10vwt
 rpmVersion:             OSS_MAIN_LINUX.X64_160623

For high throughput writes that originated from the CDB (note the conDbUniqueID and conDbUniqueName values):

CellCLI> list quarantine detail
 name:                   10
 asmClusterId:           eff096e82317ff87bfb2ee163731f7f7
 cellsrvChecksum:        9f98483ef351a1352d567ebb1ca8aeab
 clientPID:              17206
 comment:                None
 crashReason:            ORA-600[CacheGet::process:C2C_OFFLOAD_CACHEGET_CRASH]
 creationTime:           2016-06-24T12:59:06-07:00
 conDbUniqueID:          4263312973 
 conDbUniqueName:        WRITES 
 dbUniqueID:             0 
 dbUniqueName:           UnknownDBName 
 incidentID:             25
 quarantineMode:         "FULL Quarantine"
 quarantinePlan:         SYSTEM
 quarantineReason:       Crash
 quarantineType:         HIGH_THROUGHPUT_WRITE
 remoteHostName:         slc10vwt
 rpmVersion:             OSS_MAIN_LINUX.X64_160623
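The quarantine listings above are plain attribute-and-value text, so the cases can be distinguished programmatically. The following Python sketch is a hypothetical helper, not an Oracle-supplied utility; it parses such a listing and applies the classification rules described in this section:

```python
# Hypothetical helper (not an Oracle-supplied utility): parse a CellCLI
# LIST QUARANTINE DETAIL listing and classify the quarantine using the
# rules described in this section.

def parse_quarantine(listing):
    """Turn 'name: value' attribute lines into a dictionary."""
    attrs = {}
    for line in listing.splitlines():
        if ":" not in line:
            continue
        name, _, value = line.partition(":")
        attrs[name.strip()] = value.strip()
    return attrs

def classify(attrs):
    """Distinguish the rebalance, database, and CDB quarantine cases."""
    qtype = attrs.get("quarantineType")
    if qtype == "ASM_OFFLOAD_REBALANCE":
        return "rebalance"
    if qtype == "HIGH_THROUGHPUT_WRITE":
        # In the CDB case conDbUniqueID is populated; in the database
        # case dbUniqueID is populated instead.
        if attrs.get("conDbUniqueID", "0") != "0":
            return "high throughput write (CDB)"
        return "high throughput write (database)"
    return "other"

listing = """\
 name:                   10
 asmClusterId:           b6063030c0ffef8dffcc99bd18b91a62
 conDbUniqueID:          0
 dbUniqueID:             4263312973
 quarantineType:         HIGH_THROUGHPUT_WRITE
"""
print(classify(parse_quarantine(listing)))  # high throughput write (database)
```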

1.2.12 Protection Against Data Corruption

Data corruptions, while rare, can have a catastrophic effect on a database, and therefore on a business.

Oracle Exadata System Software takes data protection to the next level by protecting business data, not just the physical bits.

The key approach to detecting and preventing corrupted data is block checking in which the storage subsystem validates the Oracle block contents. Oracle Database validates and adds protection information to the database blocks, while Oracle Exadata System Software detects corruptions introduced into the I/O path between the database and storage. The Storage Server stops corrupted data from being written to disk. This eliminates a large class of failures that the database industry had previously been unable to prevent.

Unlike other implementations of corruption checking, checks with Oracle Exadata System Software operate completely transparently. No parameters need to be set at the database or storage tier. These checks transparently handle all cases, including Oracle Automatic Storage Management (Oracle ASM) disk rebalance operations and disk failures.

1.2.13 Fast File Creation

File creation operations are offloaded to Oracle Exadata Storage Servers.

Operations such as CREATE TABLESPACE, which can create one or more files, are significantly faster because of file creation offload.

File resize operations, which are important for auto-extensible files, are also offloaded to the storage servers.

1.2.14 Storage Index

Oracle Exadata Storage Servers maintain a storage index which contains a summary of the data distribution on the disk.

The storage index is maintained automatically, and is transparent to Oracle Database. It is a collection of in-memory region indexes. Prior to Oracle Exadata System Software release 12.2.1.1.0, each region index stores summaries for up to eight columns. Starting with release 12.2.1.1.0, each region index can store summaries for up to 24 columns, although the maximum of 24 may not be reached if set membership summaries are used. There is one region index for each 1 MB of disk space. Storage indexes work with any non-linguistic data type, and work with linguistic data types similarly to non-linguistic indexes.

Each region index maintains the minimum and maximum values of the table columns within its region. The minimum and maximum values are used to eliminate unnecessary I/O, also known as I/O filtering. The cell physical IO bytes saved by storage index statistic, available in the V$SYSSTAT and V$SESSTAT views, shows the number of bytes of I/O saved by using the storage index. The content stored in one region index is independent of the other region indexes. This makes them highly scalable, and avoids latch contention.

Queries using the following comparisons are improved by the storage index:

  • Equality (=)

  • Inequality (<, !=, or >)

  • Less than or equal (<=)

  • Greater than or equal (>=)

  • IS NULL

  • IS NOT NULL

Oracle Exadata System Software builds storage indexes automatically. After a query runs with a comparison predicate that is greater than the maximum or less than the minimum value for the column in a region, and that would have benefited if a storage index had been present, Oracle Exadata System Software creates the storage index so that subsequent similar queries benefit.

In Oracle Exadata System Software release 12.2.1.1.0 and later, when data has been stored using the in-memory format columnar cache, Oracle Exadata Database Machine stores these columns compressed using dictionary encoding. For columns with fewer than 200 distinct values, the storage index creates a very compact in-memory representation of the dictionary and uses this compact representation to filter disk reads based on equality predicates. This feature is called set membership. A more limited filtering ability extends up to 400 distinct values.

For example, suppose a region of disk holds a list of customers in the United States and Canada. When you run a query looking for customers in Mexico, Oracle Exadata Storage Server can use the new set membership capability to improve the performance of the query by filtering out disk regions that do not contain customers from Mexico. In Oracle Exadata System Software releases earlier than 12.2.1.1.0, which do not have the set membership capability, a regular storage index would be unable to filter those disk regions.
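The set membership rule can be sketched in a few lines. The following Python model is illustrative only, not Oracle's implementation, and the region contents are invented for the customer example above:

```python
# Illustrative model of set membership filtering (not Oracle's
# implementation). For a low-cardinality column, each region index keeps
# a compact set of the dictionary values present in that region; an
# equality predicate skips any region whose set lacks the value.
# Region contents below are invented for the customer example.

regions = [
    {"countries": {"United States", "Canada"}},
    {"countries": {"United States", "Mexico"}},
]

def regions_for_equality(regions, value):
    """Return the indexes of regions that may hold rows equal to value."""
    return [i for i, r in enumerate(regions) if value in r["countries"]]

# WHERE country = 'Mexico' reads only the second region.
print(regions_for_equality(regions, "Mexico"))  # [1]
```

A min/max summary alone could not skip the first region here, because 'Mexico' sorts between 'Canada' and 'United States'; the value set makes the elimination possible.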

Note:

The effectiveness of storage indexes can be improved by ordering the rows based on columns that frequently appear in WHERE query clauses.

Note:

The storage index is maintained during write operations to uncompressed blocks and OLTP compressed blocks. A write operation to an Exadata Hybrid Columnar Compression compressed block or an encrypted tablespace invalidates only the region index for the affected region, not the entire storage index. The storage index for Exadata Hybrid Columnar Compression is rebuilt on subsequent scans.

Example 1-1 Elimination of Disk I/O with Storage Index

The following figure shows a table and region indexes. The values in the table range from 1 to 8. One region index stores a minimum of 1 and a maximum of 5. The other region index stores a minimum of 3 and a maximum of 8.

For a query such as SELECT * FROM TABLE WHERE B<2, only the first set of rows match. Disk I/O is eliminated because the minimum and maximum of the second set of rows do not match the WHERE clause of the query.
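The elimination in Example 1-1 can be modeled in a few lines. The following Python sketch is illustrative only, not CELLSRV code:

```python
# Illustrative model of the elimination in Example 1-1 (not CELLSRV
# code). Each region index keeps the minimum and maximum of column B for
# its 1 MB region; a region is read only if the predicate can be
# satisfied somewhere in that [min, max] range.

regions = [
    {"min": 1, "max": 5},  # first set of rows
    {"min": 3, "max": 8},  # second set of rows
]

def regions_to_scan(regions, predicate):
    """Return the indexes of regions whose range may satisfy the predicate."""
    return [i for i, r in enumerate(regions) if predicate(r["min"], r["max"])]

# WHERE B < 2: a region can match only if its minimum is below 2, so the
# second region's I/O is eliminated.
print(regions_to_scan(regions, lambda lo, hi: lo < 2))  # [0]
```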

Example 1-2 Partition Pruning-like Benefits with Storage Index

In the following figure, there is a table named Orders with the columns Order_Number, Order_Date, Ship_Date, and Order_Item. The table is range partitioned on the Order_Date column.

The following query looks for orders placed since January 1, 2015:

SELECT count(*) FROM Orders WHERE Order_Date >= to_date('2015-01-01', 'YYYY-MM-DD');

Because the table is partitioned on the Order_Date column, the preceding query avoids scanning unnecessary partitions of the table. Queries on Ship_Date do not benefit from Order_Date partitioning, but Ship_Date and Order_Number are highly correlated with Order_Date. Storage indexes take advantage of ordering created by partitioning or sorted loading, and can use it with the other columns in the table. This provides partition pruning-like performance for queries on the Ship_Date and Order_Number columns.
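The partition pruning-like effect on a correlated column can be sketched as follows. This Python model is illustrative only (the dates are invented, and this is not how Oracle stores the summaries): because rows are physically clustered by Order_Date, the correlated Ship_Date column also falls into narrow per-region ranges.

```python
# Illustrative sketch of partition pruning-like filtering on a correlated
# column (dates invented; not Oracle internals). Because rows are
# physically clustered by Order_Date, the correlated Ship_Date column
# also falls into narrow per-region [min, max] ranges, so a Ship_Date
# predicate prunes regions much as Order_Date partition pruning would.

from datetime import date

regions = [
    {"ship_min": date(2014, 1, 5), "ship_max": date(2014, 6, 30)},
    {"ship_min": date(2014, 7, 1), "ship_max": date(2014, 12, 28)},
    {"ship_min": date(2015, 1, 2), "ship_max": date(2015, 6, 15)},
]

def regions_on_or_after(regions, cutoff):
    """Regions that may hold rows with Ship_Date >= cutoff."""
    return [i for i, r in enumerate(regions) if r["ship_max"] >= cutoff]

# WHERE Ship_Date >= 2015-01-01 scans only the last region.
print(regions_on_or_after(regions, date(2015, 1, 1)))  # [2]
```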

Example 1-3 Improved Join Performance Using Storage Index

Using the storage index allows table joins to skip unnecessary I/O operations. For example, the following query performs an I/O operation and applies a Bloom filter to only the first block of the fact table.

SELECT count(*) FROM fact, dim WHERE fact.m = dim.m AND dim.product = 'Hard drive';

The I/O for the second block of the fact table is completely eliminated by the storage index because its minimum/maximum range (5,8) is not present in the Bloom filter.
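This join elimination can be modeled simply. In the following hedged Python sketch, a plain set of dimension join keys stands in for the Bloom filter (a real Bloom filter is probabilistic and can return false positives), and the key values are invented for illustration:

```python
# Simplified model of the join elimination described above. A plain set
# of dimension join keys stands in for the Bloom filter (a real Bloom
# filter is probabilistic and can return false positives). A fact-table
# region is read only if some key in its [min, max] range is in the
# filter. The key values are invented for illustration.

dim_keys = {1, 2, 3, 4}          # join keys where dim.product = 'Hard drive'
fact_regions = [(1, 4), (5, 8)]  # (min, max) of fact.m per storage region

def regions_to_join(fact_regions, keys):
    """Return the indexes of fact regions that may contain join matches."""
    return [i for i, (lo, hi) in enumerate(fact_regions)
            if any(lo <= k <= hi for k in keys)]

# The (5, 8) region holds no key from the filter, so its I/O is skipped.
print(regions_to_join(fact_regions, dim_keys))  # [0]
```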

1.3 Oracle Exadata System Software Components

This section provides a summary of the following Oracle Exadata System Software components.

1.3.1 About Oracle Exadata System Software

Unique software algorithms in Oracle Exadata System Software implement database intelligence in storage, PCI-based flash, and RDMA Network Fabric networking to deliver higher performance and capacity at lower costs than other platforms.

Oracle Exadata Storage Server is a network-accessible storage device with Oracle Exadata System Software installed on it. The software communicates with the database using a specialized iDB protocol, and provides both simple I/O functionality, such as block-oriented reads and writes, and advanced I/O functionality, including predicate offload and I/O Resource Management (IORM). Each storage server has physical disks. The physical disk is an actual device within the storage server that constitutes a single disk drive spindle.

Within the storage servers, a logical unit number (LUN) defines a logical storage resource from which a single cell disk can be created. The LUN refers to the access point for storage resources presented by the underlying hardware to the upper software layers. The precise attributes of a LUN are configuration-specific. For example, a LUN could be striped, mirrored, or both striped and mirrored.

A cell disk is an Oracle Exadata System Software abstraction built on top of a LUN. After a cell disk is created from the LUN, it is managed by Oracle Exadata System Software and can be further subdivided into grid disks, which are directly exposed to the database and Oracle Automatic Storage Management (Oracle ASM) instances. Each grid disk is a potentially noncontiguous partition of the cell disk that is directly exposed to Oracle ASM for disk group creation and expansion.

This level of virtualization enables multiple Oracle ASM clusters and multiple databases to share the same physical disk. This sharing provides optimal use of disk capacity and bandwidth. Various metrics and statistics collected on the cell disk level enable you to evaluate the performance and capacity of storage servers. IORM schedules the cell disk access in accordance with user-defined policies.
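The hierarchy just described (physical disk, LUN, cell disk, grid disk) can be pictured as a simple containment model. The following Python sketch is purely illustrative; the disk names mimic common Exadata naming conventions but are hypothetical, and this is not Oracle's data model:

```python
# Purely illustrative containment model of the abstractions described
# above; the disk names mimic common Exadata conventions but are
# hypothetical, and this is not Oracle's data model.

from dataclasses import dataclass, field

@dataclass
class GridDisk:
    name: str
    size_gb: int

@dataclass
class CellDisk:            # built on top of a single LUN
    name: str
    grid_disks: list = field(default_factory=list)

    def create_grid_disk(self, name, size_gb):
        gd = GridDisk(name, size_gb)
        self.grid_disks.append(gd)
        return gd

# One cell disk per LUN, subdivided into grid disks that Oracle ASM
# uses for disk group creation and expansion.
cd = CellDisk("CD_00_cell01")
cd.create_grid_disk("DATA_CD_00_cell01", 400)
cd.create_grid_disk("RECO_CD_00_cell01", 100)
print([gd.name for gd in cd.grid_disks])  # ['DATA_CD_00_cell01', 'RECO_CD_00_cell01']
```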

The following image illustrates how the components of a storage server (also called a cell) are related to grid disks.

  • A LUN is created from a physical disk.
  • A cell disk is created on a LUN. A segment of each cell disk is reserved for use by Oracle Exadata System Software; this segment is referred to as the cell system area.
  • Multiple grid disks can be created on a cell disk.

Figure 1-3 Oracle Exadata Storage Server Components


The following image illustrates software components in the Oracle Exadata Storage Server environment.

Figure 1-4 Software Components in the Oracle Exadata Database Machine Environment


The figure illustrates the following environment:

  • Single-instance or Oracle RAC databases access storage servers using the iDB protocol over an RDMA Network Fabric network. Each database server runs the Oracle Database and Oracle Grid Infrastructure software. Resources are managed for each database instance by Oracle Database Resource Manager (shown as DBRM).

  • The database servers include Oracle Exadata System Software functionality, such as the Management Server (MS), the DBMCLI command-line interface, and the OS-based ExaWatcher utility.

  • Storage servers contain cell-based utilities and processes from Oracle Exadata System Software, including:

    • Cell Server (CELLSRV)—the primary component of the Oracle Exadata System Software running in the storage server, which provides the majority of the storage server services. CELLSRV services database requests for disk I/O and provides the advanced SQL offload capabilities. CELLSRV implements the I/O Resource Management (IORM) functionality to meter out I/O bandwidth to the various databases and consumer groups issuing I/O calls on the storage server.

    • Offload Server (CELLOFLSRV <version>)—a helper process to the Cell Server that processes offload requests from a specific database version. These processes allow the storage server to respond to requests from multiple database versions residing on one or more database servers.

    • Management Server (MS)—the primary interface to administer, manage and query the status of the storage server. It works in cooperation with the Cell Control Command-Line Interface (CellCLI) and processes most of the commands from CellCLI.

    • Restart Server (RS)—monitors the heartbeat with the MS and the CELLSRV processes, and restarts the servers if they fail to respond within the allowable heartbeat period.

  • Storage cells are configured on the network, and are managed by the Oracle Exadata System Software CellCLI utility.

  • Each storage server contains multiple disks which store the data for the database instances on the database servers. The data is stored in disks managed by Oracle ASM.

1.3.2 About Oracle Automatic Storage Management

Oracle Automatic Storage Management (Oracle ASM) is the cluster volume manager and file system used to manage Oracle Exadata Storage Server resources.

Oracle ASM provides enhanced storage management by:

  • Striping database files evenly across all available storage cells and disks for optimal performance.
  • Using mirroring and failure groups to avoid any single point of failure.
  • Enabling dynamic add and drop capability for non-intrusive cell and disk allocation, deallocation, and reallocation.
  • Enabling multiple databases to share storage cells and disks.

The following topics provide a brief overview of Oracle ASM:

1.3.2.1 Oracle ASM Disk Groups

An Oracle Automatic Storage Management (Oracle ASM) disk group is the primary storage abstraction within Oracle ASM, and is composed of one or more grid disks.

Oracle Exadata Storage Server grid disks appear to Oracle ASM as individual disks available for membership in Oracle ASM disk groups. Whenever possible, grid disk names should correspond closely with Oracle ASM disk group names to assist in problem diagnosis between Oracle ASM and Oracle Exadata System Software.

The Oracle ASM disk groups are as follows:

  • DATA is the data disk group.

  • RECO is the recovery disk group.

  • DBFS (Oracle Database File System) is the file system disk group.

  • SPARSE is a sparse disk group to keep snapshot files.

To take advantage of Oracle Exadata System Software features, such as predicate processing offload, the disk groups must contain only Oracle Exadata Storage Server grid disks, the tables must be fully contained in these disk groups, and the disk group must have the cell.smart_scan_capable attribute set to TRUE.

Note:

The Oracle Database and Oracle Grid Infrastructure software must be release 12.1.0.2.0 BP3 or later when using sparse grid disks.

1.3.2.2 Oracle ASM Failure Group

An Oracle ASM failure group is a subset of disks in an Oracle ASM disk group that can fail together because they share the same hardware.

Oracle ASM considers failure groups when making redundancy decisions.

For Oracle Exadata Storage Servers, all grid disks, which consist of the Oracle ASM disk group members and candidates, can effectively fail together if the storage cell fails. Because of this scenario, all Oracle ASM grid disks sourced from a given storage cell should be assigned to a single failure group representing the cell.

For example, if all grid disks from two storage cells, A and B, are added to a single Oracle ASM disk group with normal redundancy, then all grid disks on storage cell A are designated as one failure group, and all grid disks on storage cell B are designated as another failure group. This enables Oracle Exadata System Software and Oracle ASM to tolerate the failure of either storage cell.

Failure groups for Oracle Exadata Storage Server grid disks are set by default so that the disks on a single cell are in the same failure group, making correct failure group configuration simple for Oracle Exadata Storage Servers.

You can define the redundancy level for an Oracle ASM disk group when creating a disk group. An Oracle ASM disk group can be specified with normal or high redundancy. Normal redundancy double mirrors the extents, and high redundancy triple mirrors the extents. Oracle ASM normal redundancy tolerates the failure of a single cell or any set of disks in a single cell. Oracle ASM high redundancy tolerates the failure of two cells or any set of disks in two cells. Base your redundancy setting on your desired protection level. When choosing the redundancy level, ensure the post-failure I/O capacity is sufficient to meet the redundancy requirements and performance service levels. Oracle recommends using three cells for normal redundancy. This ensures the ability to restore full redundancy after cell failure. Consider the following:

  • If a cell or disk fails, then Oracle ASM automatically redistributes the cell or disk contents across the remaining disks in the disk group, as long as there is enough space to hold the data. For an existing disk group using Oracle ASM redundancy, the USABLE_FILE_MB and REQUIRED_MIRROR_FREE_MB columns in the V$ASM_DISKGROUP view give the amount of usable space and the space required for redundancy, respectively.

  • If a cell or disk fails, then the remaining disks should be able to generate the IOPS necessary to sustain the performance service level agreement.

After a disk group is created, the redundancy level of the disk group cannot be changed. To change the redundancy of a disk group, you must create another disk group with the appropriate redundancy, and then move the files.
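The space implications of the redundancy levels reduce to simple arithmetic. The following back-of-the-envelope Python sketch is an approximation only (it ignores the cell system area, rebalance headroom, and Oracle's actual free-space accounting):

```python
# Back-of-the-envelope approximation (ignores the cell system area and
# rebalance headroom; not Oracle sizing guidance): usable capacity is
# roughly raw capacity divided by the number of extent copies.

def usable_tb(raw_tb_per_cell, cells, copies):
    """Approximate usable space: normal redundancy = 2 copies, high = 3."""
    return raw_tb_per_cell * cells / copies

# Three cells with 100 TB of raw capacity each:
print(usable_tb(100, 3, 2))  # 150.0  (normal redundancy)
print(usable_tb(100, 3, 3))  # 100.0  (high redundancy)
```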

Each Exadata Cell is a failure group. A normal redundancy disk group must contain at least two failure groups. Oracle ASM automatically stores two copies of each file extent, with the mirrored extents placed in different failure groups. A high redundancy disk group must contain at least three failure groups. Oracle ASM automatically stores three copies of each file extent, with each copy in a separate failure group.

System reliability can diminish if your environment has an insufficient number of failure groups. A small number of failure groups, or failure groups of uneven capacity, can lead to allocation problems that prevent full use of all available storage.
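The failure group placement rule described above can be sketched as follows. This Python model is illustrative only (cell and disk names are invented; this is not Oracle ASM code):

```python
# Illustrative model of the placement rule described above (cell and
# disk names invented; not Oracle ASM code). Each extent copy lands in
# a different failure group, so losing every disk in one cell still
# leaves at least one copy.

failure_groups = {
    "CELL_A": ["A_disk0", "A_disk1"],
    "CELL_B": ["B_disk0", "B_disk1"],
    "CELL_C": ["C_disk0", "C_disk1"],
}

def place_extent(failure_groups, copies):
    """Pick one disk from each of `copies` distinct failure groups."""
    if copies > len(failure_groups):
        raise ValueError("not enough failure groups for this redundancy")
    return [disks[0] for disks in list(failure_groups.values())[:copies]]

# Normal redundancy (two copies) mirrors across two different cells.
print(place_extent(failure_groups, copies=2))  # ['A_disk0', 'B_disk0']
```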

1.3.2.3 Maximum Availability with Oracle ASM

Oracle recommends high redundancy Oracle ASM disk groups, and file placement configuration which can be automatically deployed using Oracle Exadata Deployment Assistant.

High redundancy can be configured for DATA, RECO, or any other Oracle ASM disk group with a minimum of 3 storage cells. Starting with Exadata Software release 12.1.2.3.0, the voting disks can reside in a high redundancy disk group, and additional quorum disks (essentially equivalent to voting disks) can reside on database servers if there are fewer than 5 Exadata storage cells.

Maximum availability architecture (MAA) best practice uses two main Oracle ASM disk groups: DATA and RECO. The disk groups are organized as follows:

  • The disk groups are striped across all disks and Oracle Exadata Storage Servers to maximize I/O bandwidth and performance, and to simplify management.
  • The DATA and RECO disk groups are configured for high (3-way) redundancy.

The benefits of high redundancy disk groups are illustrated by the following outage scenarios:

  • Double partner disk failure: Protection against loss of the database and Oracle ASM disk group due to a disk failure followed by a second partner disk failure.
  • Disk failure when Oracle Exadata Storage Server is offline: Protection against loss of the database and Oracle ASM disk group when a storage server is offline and one of the storage server's partner disks fails. The storage server may be offline because of Exadata storage planned maintenance, such as Exadata rolling storage server patching.
  • Disk failure followed by disk sector corruption: Protection against data loss and I/O errors when latent disk sector corruptions exist and a partner storage disk is unavailable either due to planned maintenance or disk failure.

If the voting disks reside in a high redundancy disk group, which is part of the default Exadata high redundancy deployment, then the cluster and database remain available for the preceding failure scenarios. If the voting disks reside in a normal redundancy disk group, then the database cluster fails and the database must be restarted. You can eliminate that risk by moving the voting disks to a high redundancy disk group and creating additional quorum disks on database servers.

Oracle recommends high redundancy for ALL (DATA and RECO) disk groups because it provides maximum application availability against storage failures and operational simplicity during a storage outage. In contrast, if all disk groups were configured with normal redundancy and two partner disks fail, all clusters and databases on Exadata will fail and you will lose all your data (normal redundancy does not survive double partner disk failures). Other than better storage protection, the major difference between high redundancy and normal redundancy is the amount of usable storage and write I/Os. High redundancy requires more space, and has three write I/Os instead of two. The additional write I/O normally has negligible impact with Exadata smart write-back flash cache.

The following table describes that redundancy option, as well as others, and the relative availability trade-offs. The table assumes that voting disks reside in high redundancy disk group. Refer to Oracle Exadata Database Machine Maintenance Guide to migrate voting disks to high redundancy disk group for existing high redundancy disk group configurations.

Redundancy Option: High redundancy for ALL (DATA and RECO)

Availability Implications: Zero application downtime and zero data loss for the preceding storage outage scenarios if the voting disks reside in a high redundancy disk group. If the voting disks currently reside in a normal redundancy disk group, refer to Oracle Exadata Database Machine Maintenance Guide to migrate them to a high redundancy disk group.

Recommendation: Use this option for the best storage protection and operational simplicity for mission-critical applications. Requires more space for the higher redundancy.

Redundancy Option: High redundancy for DATA only

Availability Implications: Zero application downtime and zero data loss for the preceding storage outage scenarios. This option requires an alternative archive destination.

Recommendation: Use this option for the best storage protection for DATA, with slightly higher operational complexity. Provides more available space than high redundancy for ALL. Refer to My Oracle Support note 2059780.1 for details.

Redundancy Option: High redundancy for RECO only

Availability Implications: Zero data loss for the preceding storage outage scenarios.

Recommendation: Use this option when longer recovery times are acceptable for the preceding storage outage scenarios. Recovery options include the following:

  • Restore and recover:

    - Recreate the DATA disk group

    - Restore from RECO and tape-based backups, if required

    - Recover the database

  • Switch and recover:

    - Use RMAN SWITCH TO COPY

    - Recover the database

Redundancy Option: Normal redundancy for ALL (DATA and RECO)

Availability Implications: The preceding storage outage scenarios result in the failure of all Oracle ASM disk groups. However, cross-disk group mirror isolation, which uses the Oracle ASM disk group content type to ensure that disk groups sharing the same physical disks and storage servers do not lose disk partners in both groups at once, limits the outage to a single disk group.

Note: This option is not available for eighth or quarter racks.

Recommendation: Oracle recommends a minimum of high redundancy for DATA only. Use the normal redundancy for ALL option when the primary database is protected by an Oracle Data Guard standby database deployed on a separate Oracle Exadata Database Machine, or when the Oracle Exadata Database Machine services only development or test databases. Oracle Data Guard provides real-time data protection and fast failover for storage failures.

If Oracle Data Guard is not available and the DATA or RECO disk group is lost, then leverage the recovery options described in My Oracle Support note 1339373.1.

The optimal file placement for MAA is:

  • Oracle Database files — DATA disk group
  • Flashback log files, archived redo files, and backup files — RECO disk group
  • Redo log files — First high redundancy disk group. If no high redundancy disk group exists, then redo log files are multiplexed across the DATA and RECO disk groups.
  • Control files — First high redundancy disk group. If no high redundancy disk group exists, then use one control file in the DATA disk group. The backup control files should reside in the RECO disk group, and RMAN CONFIGURE CONTROLFILE AUTOBACKUP ON should be set.
  • Server parameter files (SPFILE) — First high redundancy disk group. If no high redundancy disk group exists, then SPFILE should reside in the DATA disk group. SPFILE backups should reside in the RECO disk group.
  • Oracle Cluster Registry (OCR) and voting disks for Oracle Exadata Database Machine Full Rack and Oracle Exadata Database Machine Half Rack — First high redundancy disk group. If no high redundancy disk group exists, then the files should reside in the DATA disk group.
  • Voting disks for Oracle Exadata Database Machine Quarter Rack or Eighth Rack — First high redundancy disk group, otherwise in normal redundancy disk group. If there are fewer than 5 Exadata storage cells with high redundancy disk group, additional quorum disks will be stored on Exadata database servers during OEDA deployment. Refer to Oracle Exadata Database Machine Maintenance Guide to migrate voting disks to high redundancy disk group for existing high redundancy disk group configurations.
  • Temporary files — First normal redundancy disk group. If the high redundancy for ALL option is used, then use the first high redundancy disk group.
  • Staging and non-database files — DBFS disk group or ACFS volume
  • Oracle software (including audit and diagnostic destinations) — Exadata database server local file system locations configured during OEDA deployment

1.3.3 About Grid RAID

A grid Redundant Array of Independent Disks (RAID) configuration uses Oracle ASM mirroring capabilities.

To use grid RAID, you place grid disks in an Oracle ASM disk group with a normal or high redundancy level, and set all grid disks in the same cell to be in the same Oracle ASM failure group. This ensures that Oracle ASM does not mirror data extents using disks within the cell. Using disks from different cells ensures that an individual cell failure does not cause the data to be unavailable.

Grid RAID also simplifies cell disk creation. With grid RAID, Oracle software automatically creates the required LUNs from the available physical disks.

1.3.4 About Storage Server Security

Security for Exadata Storage Servers is enforced by identifying which clients can access storage servers and grid disks.

Clients include Oracle ASM instances, database instances, and clusters. When creating or modifying grid disks, you can configure the Oracle ASM owner and the database clients that are allowed to use those grid disks.

1.3.5 About iDB Protocol

The iDB protocol is a unique Oracle data transfer protocol that serves as the communications protocol among Oracle ASM, database instances, and storage cells.

General-purpose data transfer protocols operate only on the low-level blocks of a disk. In contrast, the iDB protocol is aware of the Oracle internal data representation and is the necessary complement to Exadata storage server specific features, such as predicate processing offload.

In addition, the iDB protocol provides interconnection bandwidth aggregation and failover.

1.3.6 About Oracle Exadata System Software Processes

Oracle Exadata System Software includes the following software processes:

  • Cell Server (CELLSRV) is the primary Exadata storage server software component and provides the majority of Exadata storage services. It serves simple block requests, such as database buffer cache reads, and facilitates Smart Scan requests, such as table scans with projections and filters. CELLSRV implements Exadata I/O Resource Management (IORM), which works in conjunction with Oracle Database resource management to meter out I/O bandwidth to the various databases and consumer groups that are issuing I/Os. CELLSRV also collects numerous statistics relating to its operations. CELLSRV is a multithreaded server and typically uses the largest portion of CPU resources on a storage server.

  • Offload server (CELLOFLSRV <version>) is a helper process to CELLSRV that processes offload requests from a specific database version. The offload servers enable each storage server to support all offload operations from multiple database versions. The offload servers run automatically in conjunction with CELLSRV, and they require no additional configuration or maintenance.

  • Management Server (MS) provides standalone Oracle Exadata System Software management and configuration functions. MS is also responsible for sending alerts, and collects some statistics in addition to those collected by CELLSRV.

  • Restart Server (RS) monitors the CELLSRV, offload server, and MS processes and restarts them, if necessary.

1.3.7 About Cell Management

Each cell in the Oracle Exadata Storage Server grid is individually managed with Cell Control Command-Line Interface (CellCLI).

The CellCLI utility provides a command-line interface to the cell management functions, such as cell initial configuration, cell disk and grid disk creation, and performance monitoring. The CellCLI utility runs on the cell, and is accessible from a client computer that has network access to the storage cell or is directly connected to the cell. The CellCLI utility communicates with Management Server to administer the storage cell.

To access the cell, use either Secure Shell (SSH) access or local access, for example, through a KVM (keyboard, video, mouse) switch. SSH allows remote access, but local access might be necessary during initial configuration when the cell is not yet configured for the network. With local access, you have access to the cell operating system shell prompt and can use various tools, such as the CellCLI utility, to administer the cell.

You can run the same CellCLI commands remotely on multiple cells with the dcli utility.

To manage a cell remotely from a compute node, you can use the ExaCLI utility. ExaCLI enables you to run most CellCLI commands on a cell. This is necessary if you do not have direct access to a cell to run CellCLI, or if SSH service on the cell has been disabled. To run commands on multiple cells remotely, you can use the exadcli utility.

See Also:

  • Using the ExaCLI Utility in Oracle Exadata Database Machine Maintenance Guide, for additional information about managing cells remotely

  • Using the exadcli Utility in Oracle Exadata Database Machine Maintenance Guide, for additional information about managing multiple cells remotely

1.3.8 About Database Server Software

Oracle software is installed on the Exadata database servers.

Oracle Exadata System Software works seamlessly with Oracle Database. The software on the database servers includes:

  • Oracle Database instance, which contains the set of Oracle Database background processes that operate on the stored data and the shared allocated memory that those processes use to do their work. The database server software also includes utilities for administration, performance management, and support of the database.

  • Oracle Automatic Storage Management (Oracle ASM), a clustered file system and volume manager that provides storage management optimized for the database and Oracle Exadata Storage Servers. Oracle ASM is part of Oracle Grid Infrastructure. The Oracle Grid Infrastructure software provides the essential functions to maintain cluster coherence across all of the Exadata servers. The Oracle Grid Infrastructure software also monitors the health and liveness of both database and storage servers, providing database high availability during planned and unplanned storage outages.

    The Oracle ASM instance handles placement of data files on disks, operating as a metadata manager. The Oracle ASM instance is primarily active during file creation and extension, or during disk rebalancing following a configuration change. Run-time I/O operations are sent directly from the database to storage cells without passing through an Oracle ASM instance.

  • The Oracle Database Resource Manager, which ensures that I/O resources are properly allocated within a database.

  • The iDB protocol is used by the database instance to communicate with cells, and is implemented in an Oracle-supplied library statically linked with the database server.

1.3.9 About Oracle Enterprise Manager for Oracle Exadata Database Machine

Oracle Enterprise Manager provides a complete target that enables you to monitor Oracle Exadata Database Machine, including configuration and performance, in a graphical user interface (GUI).

The following figure shows the Exadata Storage Server Grid home page. Viewing this page, you can quickly see the health of the storage servers, key storage performance characteristics, and resource utilization of storage by individual databases.

Figure 1-5 Exadata Storage Server Grid home page in Oracle Enterprise Manager


In addition to reports, Oracle Enterprise Manager enables you to set metric thresholds for alerts and monitor metric values to determine the health of your Exadata systems.