Note:
View a video of how to manage and monitor your Oracle Exadata Database Machine using Oracle Enterprise Manager Cloud Control 13c:
https://youtu.be/5S0mIx6cegE
To manage the plug-in, you need to create roles and administrators, and then assign roles to administrators. This restricts the privileges that each user has, for example in deleting the plug-in or accessing reports.
Note:
For security reasons, Oracle recommends that the SYSMAN account be used only as a template to create other accounts, and not used directly.
To create roles to provide management rights to users:
When the newly created administrator logs in, unlike SYSMAN, the administrator is restricted by the privileges set.
Note:
For more information and details, see the Working with Systems Infrastructure Targets chapter of the Oracle® Enterprise Manager Cloud Control Administrator's Guide.
Database Machine management simplifies monitoring and managing tasks by integrating all hardware and software components into one entity. You do not need to monitor each target individually, but instead you can view the whole Exadata Database Machine as a single target. You can view all critical issues in the system, monitor performance, and drill down to individual targets from the Database Machine target home page.
The following topology topics are presented in this section:
Use the Topology page of Database Machine to view the topology of the system by Cluster or by Database. A cluster is a complete software system starting with a RAC database, the underlying ASM, and CRS; it defines one interconnected logical entity. The Database Machine can include several clusters, one cluster, or just a number of individual databases. While cabinets define the hardware topology of the Database Machine, clusters define its logical or system topology.
You can view the Topology by Cluster or Database. Click an element in the Topology and view alert data associated with the element.
You can monitor all components of the Database Machine. Database Machine monitors all subcomponent targets, whether hardware or software. This includes the database, ASM, CRS, hosts, Exadata Storage Servers, and the InfiniBand network.
To view the topology of an existing Database Machine target:
You can drill down immediately to a subcomponent target of the Database Machine (such as RAC, a database instance, or an Exadata cell).
To drill down to individual targets:
You can view critical metrics for all the hardware subcomponents of the Database Machine such as DB hosts, Exadata Storage Servers, InfiniBand switches and so on. These metrics vary for different component targets. For example, database server nodes and Exadata servers include the CPU, I/O, and storage metrics.
To view critical hardware-centric information for the entire Database machine:
You can add Exadata components manually using the following steps:
Remove all members of the Oracle Exadata Database Machine. If a member target is shared with another Database Machine target, then that member target is not deleted and continues to be monitored. In other words, a member target is deleted only if it is associated solely with this Database Machine target.
Remove only system members of the Oracle Exadata Database Machine. The other member targets will not be deleted and will continue to be monitored. They can be associated to another Oracle Exadata Database Machine, if required.
To remove an Exadata Database Machine target:
Note:
Host targets for the compute nodes and any targets that are also member targets of another Oracle Exadata Database Machine target will not be removed. System and non-system targets include:
System Targets:
Oracle Exadata Database Machine
Oracle Infiniband Network (Enterprise Manager 12c target)
Oracle Exadata Storage Server Grid
Non-System Targets:
Oracle Exadata Storage Server
Oracle Exadata KVM
Systems Infrastructure Switch
Systems Infrastructure PDU
Systems Infrastructure Rack
Oracle Infiniband Switch (Enterprise Manager 12c target)
Oracle Engineered System Cisco Switch (Enterprise Manager 12c target)
Oracle Engineered System PDU (Enterprise Manager 12c target)
Oracle Engineered System ILOM Server (Enterprise Manager 12c target)
If you need to remove a component of an Exadata Database Machine target, you can perform this task within Enterprise Manager Cloud Control 13c:
In some cases, the Exadata Database Machine schematic diagram is not displaying the components correctly. For example:
You may have successfully discovered the Exadata Database Machine, but some components are not displaying correctly in the Exadata schematic diagram. Instead, an empty slot is shown in place of the component.
The Exadata Database Machine schematic diagram shows the status of the component as "red/down," whereas the individual components are actually up and running fine.
You want to re-locate or rearrange the order of the components in the slots of Exadata Database Machine schematic diagram.
To accomplish these tasks, you will need to drop a component from the schematic diagram and add the correct one:
To drop a component from the Exadata Database Machine schematic diagram:
From the Targets menu, select Exadata.
On the Exadata Database Machine schematic diagram, click Edit as shown in Figure 5-1:
Figure 5-1 Schematic Diagram Edit Button
Right-click on the component you want to drop. In the pop-up window, select Delete Component as shown in Figure 5-2:
Figure 5-2 Delete Component
Ensure you have selected the correct component in the pop-up, then click OK as shown in Figure 5-3:
Figure 5-3 Confirm Delete
The Exadata Database Machine schematic diagram will refresh to show the empty slot, as shown in Figure 5-4:
Figure 5-4 Refreshed Schematic Diagram with Empty Slot
Once the component has been deleted from the slot you specified, click Done on the Exadata Database Machine schematic diagram.
This section provides introductory instructions for managing Exadata Storage Servers. The following topics are presented:
An Exadata Storage Server is a network-accessible storage array with Exadata software installed on it. Use the Exadata Home page to manage and monitor the Oracle Exadata Storage Server (also known as Exadata cell) by managing it as an Enterprise Manager Cloud Control target. You can discover and consolidate management, monitoring and administration of a single or a group of Oracle Exadata Storage Servers in a datacenter using Enterprise Manager.
Exadata Storage Servers can be discovered automatically or manually. Once discovered, you can add them as Enterprise Manager targets. The individual Exadata Storage Server is monitored and managed as an Enterprise Manager target and provides the exception, configuration and performance information.
Grouping of Exadata Storage Servers is used for easy management and monitoring of the set of Storage Servers. You can group them both manually and automatically. The grouping function provides an aggregation of exceptions, configuration and performance information of the group of cells.
You can view performance analysis by linking Exadata performance both at a cell level and group level to ASM and database performance. You can drill down to Exadata configuration and performance issues from both the database and ASM targets.
Storage Grid (for example, multiple database/ASM instances sharing the same Exadata Storage Server) is supported to the same extent as dedicated storage.
You can view the configuration of an Oracle Exadata Storage Server target by following the steps below:
To perform an administration operation on an Exadata Storage Server, such as executing a cell command, follow these steps:
Oracle Exadata Database Machine Cells are added as targets during the database machine discovery workflow (see Exadata Database Machine Discovery) and are grouped automatically under the group Exadata Storage Server Grid.
To access the IORM Performance page:
Select an Exadata Storage Server cell. One way to select the cell:
From the Targets menu, select Exadata.
Select a DB Machine from the list of Target Names.
In the Target Navigation pane, expand the Exadata Grid item and click one of the cells.
Once you have selected an Exadata Storage Server cell, click the Exadata Storage Server menu, select Administration, then Manage IO Resource.
Once you have accessed the IORM page, you can make the following modifications:
The IORM Monitoring section of the page provides a view of the performance statistics of Disk IO (Wait, IOPS, MBPS, Utilization, Latency, and Objective charts). These statistics help to identify which databases and consumer groups are using the available resources. They also help to adjust the IORM configuration (using IORM Settings section on the same page) as needed.
For further details on managing I/O resources, refer to the Managing I/O Resources chapter in the Oracle® Exadata Storage Server Software User's Guide.
To update the I/O Resource Manager (IORM) settings (for Exadata Storage Server software release 12.1.2.1.0 and later):
Navigate to the IORM Performance page as described above. Figure 5-8 shows the I/O Resource Manager (IORM) Settings pane.
Figure 5-8 I/O Resource Manager (IORM) Settings
Note:
You can also update a single cell. Expand the Exadata Grid group to view all cells associated with the group. Click the cell you want to update.
The steps to update the IORM settings are the same for a single cell or a group of cells.
From the Database Name column, select a database from the drop-down menu.
Enter a value for the Hard Disk I/O Utilization Limit column.
Enter a value for the Database I/O Share column.
Enter minimum and maximum values (in MB) for the Flash Cache column.
In the Disk I/O Objective drop-down menu, select an objective from the list (Auto is the default):
Low Latency - Use this setting for critical OLTP workloads that require extremely good disk latency. This setting provides the lowest possible latency by significantly limiting disk utilization.
Balanced - Use this setting for critical OLTP and DSS workloads. This setting balances low disk latency and high throughput. This setting limits disk utilization of large I/Os to a lesser extent than Low Latency to achieve a balance between good latency and good throughput.
High Throughput - Use this setting for critical DSS workloads that require high throughput.
Auto - Use this setting to have IORM determine the optimization objective. IORM continuously and dynamically determines the optimization objective, based on the workloads observed, and resource plans enabled.
Basic - Use this setting to disable I/O prioritization and limit the maximum small I/O latency.
Click Update. The Exadata Cell Administration Wizard will appear prompting you for the information necessary to complete the Disk I/O Objective configuration:
On the Command page, the Cell Control Command-Line Interface (CellCLI) value should be:
# alter iormplan objective = 'auto'
Click Next.
On the Admin Credentials page, enter the username and password for the selected cells.
Click Next.
On the Schedule page, enter a job name (required) and job description (optional). Select an option to start Immediately or Later. If you select the Later option, enter the time you want the job to run.
Click Next.
On the Review page, verify the settings are correct. If there are no changes, click Submit Command.
Once the job is successfully submitted, the Job Status page will display.
Click Return to return to the I/O Resource Manager (IORM) Settings pane.
Click Get Latest to refresh the page, which will include your Disk I/O Objective selection.
Confirm the IORM objective settings. From the command line, run the following command:
# dcli -g cell_group cellcli -e "list iormplan attributes objective"
Output should show a value of auto:
cell01: auto
cell02: auto
cell03: auto
. . .
cell14: auto
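The verification step above can be sketched as a small script: given output in the cellNN: value format shown, it flags any cell whose objective differs from the expected one. The sample input and the check_iorm_objective helper are illustrative, not part of the plug-in; in practice you would pipe in the real dcli command output.

```shell
# Sketch: flag any cell whose IORM objective differs from the expected one.
# Real input would come from:
#   dcli -g cell_group cellcli -e "list iormplan attributes objective"
check_iorm_objective() {
  # $1 = expected objective; input lines look like "cell01: auto"
  awk -v want="$1" -F': ' '$2 != want { print $1 ": unexpected objective " $2 }'
}

check_iorm_objective auto <<'EOF'
cell01: auto
cell02: auto
cell03: high_throughput
EOF
```

Running the sample prints only the mismatching cell, so an empty result means every cell already has the expected objective.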
An inter-database plan specifies how resources are allocated by percentage or share among multiple databases for each cell. The directives in an inter-database plan specify allocations to databases, rather than consumer groups. The inter-database plan is configured and enabled with the CellCLI utility at each cell.
The inter-database plan is similar to a database resource plan, in that each directive consists of an allocation amount and a level from 1 to 8. For a given plan, the total allocations at any level must be less than or equal to 100 percent. An inter-database plan differs from a database resource plan in that it cannot contain subplans and only contains I/O resource directives. Only one inter-database plan can be active on a cell at any given time.
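The per-level allocation rule can be checked mechanically. The sketch below assumes a simple db,level,allocation CSV format (an illustration for the example, not a CellCLI format) and verifies that no level's percentage allocations exceed 100:

```shell
# Sketch: validate that a percentage-based inter-database plan allocates
# at most 100% at each level (levels 1-8). Input format "db,level,alloc"
# is assumed for illustration only.
validate_plan() {
  awk -F',' '
    { total[$2] += $3 }
    END {
      ok = 1
      for (lvl in total)
        if (total[lvl] > 100) { print "level " lvl " over-allocated: " total[lvl] "%"; ok = 0 }
      if (ok) print "plan OK"
    }'
}

validate_plan <<'EOF'
sales,1,60
finance,1,30
dev,2,80
test,2,40
EOF
```

Here level 2 sums to 120%, so the sketch reports it; a valid plan prints "plan OK".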
You can view the currently configured inter-database plan, update an existing percentage- or share-based inter-database plan, and configure a new percentage- or share-based plan using the Add/Remove options.
You can also choose between the Share and Percentage radio buttons and a drop-down list with Basic and Advanced options.
Note:
If the Exadata plug-in version is 12.1.0.3.0 or earlier, or if the Exadata Storage Server version is 11.2.3.1.0 or earlier, the Share and Percentage inter-database plan radio buttons are not available. You can view only percentage-based options (that is, the drop-down displays only the Basic and Advanced options).
When considering an inter-database plan:
If Oracle Exadata Storage Server is only hosting one database, then an inter-database plan is not needed.
If an inter-database plan is not specified, then all databases receive an equal allocation.
For further details on the inter-database plan, refer to the About Interdatabase Resource Management section in the Oracle® Exadata Storage Server Software User's Guide.
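As a sketch of what a share-based plan directive looks like, the hypothetical helper below assembles an ALTER IORMPLAN command string from database:share pairs. The dbplan syntax follows the Storage Server documentation, but the helper itself is an illustration; verify the generated command against your software release before running it in CellCLI.

```shell
# Sketch: build (and print, rather than execute) a share-based CellCLI
# ALTER IORMPLAN directive from "database:share" arguments.
# build_dbplan is a hypothetical helper, not a plug-in utility.
build_dbplan() {
  local directives=""
  for pair in "$@"; do
    local db="${pair%%:*}" share="${pair##*:}"
    directives+="${directives:+, }(name=${db}, share=${share})"
  done
  echo "ALTER IORMPLAN dbplan=((${directives:4}"  # placeholder, see below
}
# Simpler, correct form:
build_dbplan() {
  local directives=""
  for pair in "$@"; do
    local db="${pair%%:*}" share="${pair##*:}"
    directives+="${directives:+, }(name=${db}, share=${share})"
  done
  echo "ALTER IORMPLAN dbplan=(${directives})"
}

build_dbplan sales:8 finance:4 dev:1
```

Printing the command first lets you review the directives before pasting them into a CellCLI session on each cell.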
Enterprise Manager listens for Exadata Cell alerts sent from the Exadata Cell Management Server; so, any hardware failure or cell error will be reported in Enterprise Manager. For detailed cell error code and its interpretation, refer to the Hardware Alert Messages section in Appendix B, "Alerts and Error Messages" of the Oracle® Exadata Storage Server Software User's Guide.
All InfiniBand Switches are discovered automatically during the database machine discovery workflow (see Exadata Database Machine Discovery) and are grouped automatically under the group IB Network.
The following topics address managing your InfiniBand network:
The following metrics are available for your InfiniBand Network:
The Aggregate Sensor takes input from multiple sensors and aggregates the data to identify problems with the switch that require attention. Whenever the sensor trips into an "Asserted" state (indicating a problem) or "Deasserted" (indicating that the problem is cleared) for a component on the switch, associated Enterprise Manager events will be generated.
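The Asserted/Deasserted behavior described above can be illustrated with a short filter that maps sensor state transitions to raised/cleared event lines; the two-column input format is invented for the example and is not the switch's actual sensor output.

```shell
# Sketch: map Aggregate Sensor state transitions to event lines the way
# the metric is described: "Asserted" raises an event, "Deasserted"
# clears it. Input format ("<sensor> <state>") is illustrative.
sensor_events() {
  awk '{
    if ($2 == "Asserted")   print "EVENT raised: " $1
    if ($2 == "Deasserted") print "EVENT cleared: " $1
  }'
}

sensor_events <<'EOF'
FAN2 Asserted
FAN2 Deasserted
PSU1 Asserted
EOF
```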
This is the main metric indicating availability of the InfiniBand switch. It is collected every 60 seconds by default through the management interface of the switch.
To perform an administration operation on an Infiniband Network, follow these steps:
The Oracle Exadata plug-in release 13.1.0.1.0 and later provides flash cache resource monitoring for Oracle Exadata Storage Servers. From the Storage Server home page or from the IO Distribution Detail page, Cloud Control provides a high-level overview of flash cache resources (Figure 5-9). Details include:
I/O Utilization (as a percentage).
Hard Drive I/O Time Breakdown (in milliseconds per request).
Flash I/O Time Breakdown (in milliseconds per request).
Figure 5-9 I/O Distribution by Databases
From the IO Distribution Details page, which provides a view of all available databases statistics, select Table View (the default is Graph View) to view the data as a table (Figure 5-10):
Figure 5-10 I/O Distribution by Databases - Table View
The IORM Performance Page (Figure 5-11) provides detailed metrics, such as Average Throttle Time for Disk I/Os for both hard drives and flash drives. Select Flash Cache Space Usage (Figure 5-12) for detailed performance about flash cache space.
Figure 5-11 IORM Performance Page
Figure 5-12 IORM Performance - Flash Cache Space Usage
Oracle Enterprise Manager Cloud Control provides hardware fault monitoring for Oracle Exadata Database Machine. Table 5-1 shows the fault monitoring for the Exadata Storage Server. Table 5-2 shows the fault monitoring for the compute nodes.
Note:
The fault monitoring information in the following tables, while comprehensive, may not always be complete. As new fault monitoring functionality is added, these tables will be updated accordingly.
Table 5-1 Exadata Storage Server Fault Monitoring
Area | Fault Monitoring
---|---
Access | Cell cannot be accessed (e.g. ping failure)
Memory | Memory Controller Error, Memory DIMM Error, Memory Channel Error, Memory DIMM Temperature Sensor Error, Memory Correctable ECC Error
CPU | Internal Processor Error, Intel 5500 Chipset Core Error, Data Cache Error, Cache Error, Instruction Cache Error
ESM | ESM Battery Charge Error, ESM Battery Life Error
Hard Disk | SCSI Error (Media, Device), Disk Temperature Threshold Excess, Physical Disk Not Present, Block Corruption
Flash Disk | Flash Disk Failure, Flash Disk Predictive Failure, Flash Disk not Present
Miscellaneous | Chassis or Power Supply Fan Error, PCI-E Internal Error, Power Supply Voltage Excess Error, Temperature Excess Error, Network Port Disconnection and Fault
Table 5-2 Compute Node Fault Monitoring
Area | Fault Monitoring
---|---
Memory | Memory Controller Error, Memory DIMM Error, Memory Channel Error, Memory DIMM Temperature Sensor Error, Memory Correctable ECC Error
CPU | Internal Processor Error, Intel 5500 Chipset Core Error, Data Cache Error, Cache Error, Instruction Cache Error
Disk | SCSI Error (Media, Device), Disk Temperature Threshold Excess, Block Corruption
Miscellaneous | Chassis or Power Supply Fan Error, Non-fatal PCI-E Internal Error, Power Supply Voltage Excess Error, Temperature Excess Error, Network Port Disconnection and Fault
Enterprise Manager collects details for the following components:
An Enterprise Manager Agent runs the cellcli command via ssh to collect Storage Cell metrics. SNMP traps are sent to the Enterprise Manager Agent for subscribed alert conditions.
Monitoring requires SSH equivalence (key-based SSH) set up between the Agent user and the cellmonitor user.
ASM targets and disk groups are associated.
On the home page, rich storage data is collected, including:
Aggregate storage metrics.
Cell alerts via SNMP (PUSH).
Capacities.
IORM consumer and Database-level metrics.
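As a rough illustration of this kind of collection, the filter below reads cellcli-style name/status output and reports any cell that is not online. The two-column sample format is assumed for the example, not the exact cellcli output layout.

```shell
# Sketch: report cells that are not online from "name status"-style
# output, similar in spirit to what
#   cellcli -e list cell attributes name,status
# returns. The sample input is illustrative.
cells_not_online() {
  awk '$2 != "online" { print $1 " is " $2 }'
}

cells_not_online <<'EOF'
cel01 online
cel02 online
cel03 offline
EOF
```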
An Enterprise Manager Agent runs remote ssh calls to the InfiniBand switch to collect metrics. The InfiniBand switch sends SNMP traps (PUSH) for all alerts.
Monitoring requires SSH equivalence for the nm2user user for metric collections such as:
Response
Various sensor status
Fan
Voltage
Temperature
Port performance data
Port administration
An Enterprise Manager Agent runs a remote SNMP get call to collect metric data for the Cisco switch, including details on:
Status / Availability
Port status
Vital signs: CPU, Memory, Power, Temperature
Various network interface data:
Incoming traffic: errors, traffic in kb/s and %
Outgoing traffic: errors, traffic in kb/s and %
Administrative and operational bandwidth (Mb/s)
An Enterprise Manager Agent runs remote ipmitool calls to each Compute Node ILOM target. Monitoring requires the nm2user user credentials to run ipmitool.
The following details are collected:
Response – availability
Sensor alerts
Temperature
Voltage
Fan speeds
Configuration Data: Firmware version, serial number, and so forth.
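A sketch of the kind of parsing involved in these collections: scan ipmitool sensor-style output (pipe-separated columns, status in the fourth field) and report any sensor not in the ok state. The sample lines mimic the common ipmitool sensor table layout and should be treated as illustrative.

```shell
# Sketch: report sensors whose status column is not "ok" from
# "ipmitool sensor"-style pipe-separated output. Sample data is
# illustrative of the usual layout: name | value | unit | status | ...
failing_sensors() {
  awk -F'|' '{
    gsub(/^[ \t]+|[ \t]+$/, "", $1)   # trim sensor name
    gsub(/^[ \t]+|[ \t]+$/, "", $4)   # trim status column
    if ($4 != "ok") print $1 ": " $4
  }'
}

failing_sensors <<'EOF'
FAN1 | 5000 | RPM | ok
VDD  | 1.9  | Volts | cr
EOF
```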
An Enterprise Manager Agent runs remote SNMP get calls and receives SNMP traps (PUSH) from each PDU. Collected details include:
Response and ping status.
Phase values.