Managing the InfiniBand Fabric

All InfiniBand Switches are discovered automatically during the database machine discovery workflow (see Exadata Database Machine Discovery) and are placed in the group IBFabric@<switch-name>.

Note:

The InfiniBand Fabric target is not available for RoCE Exadata.

To view the InfiniBand Fabric, follow these steps:

  1. From the Enterprise Manager home page, select Targets, then Oracle Exadata Database Machines and Cloud Services.
  2. In the Target Navigation pane, select InfiniBand Fabric from the list.
  3. In the IB Fabric pane, you can view an overview and activity summary for all InfiniBand Switches.
  4. Click Refresh to refresh the InfiniBand schematic on demand. The refreshed view reflects real-time data.

InfiniBand/RoCE Switch Metrics

The Enterprise Manager Agent uses remote SSH and remote SNMP GET calls to collect metric data for the InfiniBand switch. InfiniBand metrics provide operational details such as:
  • Status / Availability
  • Port status
  • Vital signs: CPU, Memory, Power, Temperature
  • Various network interface data
  • Incoming traffic errors, traffic Kb/s and %
  • Outgoing traffic errors, traffic Kb/s and %
  • Administration and Operational bandwidth Mb/s
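
As an illustration of how traffic figures such as Kb/s and % can be derived from raw port data, the following is a minimal sketch, not the Agent's actual implementation. It assumes that traffic is measured with cumulative octet counters sampled at two points in time, that Kb/s means kilobits per second (1 Kb = 1000 bits), and that the percentage is relative to the operational bandwidth in Mb/s. The names PortSample, traffic_kbps, and utilization_pct are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PortSample:
    """One reading of a port's cumulative octet counter (hypothetical structure)."""
    timestamp_s: float  # time of the reading, in seconds
    octets: int         # cumulative octets counted on the port


def traffic_kbps(prev: PortSample, curr: PortSample, counter_bits: int = 64) -> float:
    """Average traffic between two samples, in kilobits per second."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr.octets - prev.octets
    if delta < 0:
        # The counter wrapped around; this sketch assumes a single wrap.
        delta += 1 << counter_bits
    return delta * 8 / 1000 / elapsed


def utilization_pct(kbps: float, bandwidth_mbps: float) -> float:
    """Traffic as a percentage of the port's operational bandwidth."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth must be positive")
    return 100.0 * kbps / (bandwidth_mbps * 1000)


# Example: 15,000,000 octets transferred in 60 seconds on a port whose
# operational bandwidth is 40,000 Mb/s (an illustrative value).
prev = PortSample(timestamp_s=0.0, octets=1_000_000)
curr = PortSample(timestamp_s=60.0, octets=16_000_000)
rate = traffic_kbps(prev, curr)        # 2000.0 Kb/s
pct = utilization_pct(rate, 40_000)    # about 0.005 %
```

The wrap-around branch matters mainly for narrow counters; with 64-bit counters a wrap is unlikely within a single collection interval.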

The following metrics are available for your InfiniBand Fabric:

Switch Aggregated Status

The Aggregate Sensor takes input from multiple sensors and aggregates the data to identify problems with the switch that require attention. Whenever the sensor for a component on the switch trips into an "Asserted" state (indicating a problem) or a "Deasserted" state (indicating that the problem has cleared), Enterprise Manager generates the associated events.
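
The transition logic described above can be sketched as follows. This is an illustrative model only, not the switch firmware or Enterprise Manager code. It assumes that each component reports either "Asserted" or "Deasserted", that a component seen for the first time is treated as previously "Deasserted", and that the aggregate is "Asserted" when any component is. The function and component names are hypothetical.

```python
from typing import Dict, List, Tuple

ASSERTED = "Asserted"
DEASSERTED = "Deasserted"


def sensor_events(previous: Dict[str, str],
                  current: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (component, new_state) for each component whose state changed."""
    events = []
    for component, state in sorted(current.items()):
        # Assumption: a component not seen before starts as Deasserted.
        if previous.get(component, DEASSERTED) != state:
            events.append((component, state))
    return events


def aggregate_state(current: Dict[str, str]) -> str:
    """The aggregate is Asserted when at least one component is Asserted."""
    if any(state == ASSERTED for state in current.values()):
        return ASSERTED
    return DEASSERTED


# Example: the fan develops a problem while the power supply problem clears.
previous = {"fan": DEASSERTED, "power_supply": ASSERTED}
current = {"fan": ASSERTED, "power_supply": DEASSERTED, "temperature": DEASSERTED}
events = sensor_events(previous, current)
# events == [("fan", "Asserted"), ("power_supply", "Deasserted")]
```

Only changes of state produce events in this sketch, so a component that stays "Asserted" across collections is reported once rather than on every collection.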

Response

This is the main metric indicating availability of the InfiniBand/RoCE switch. It is collected every 60 seconds by default through the management interface of the switch.

Switch Configuration

This metric captures the switch configuration. The information collected is useful only to Oracle Support, which uses it to help debug problems.

Switch Basic Status

This metric reports the basic status of the switch, such as Booted on, Locator light status, Power status, and the overall status of the switch.

Sensor Status

This metric reports the status of the sensors available in the switch, such as power supply, fan, motherboard, and cooling.

Switch Port Statistics

This metric provides the number of incoming and outgoing errors and the number of incoming and outgoing octets.

Component State

This metric reports the state of components in the switch, such as Fan, Motherboard, Power Supply, and the InfiniBand and Ethernet ports.

Network Port InfiniBand Performance

This metric provides performance data for each InfiniBand port.

Performing Administration Tasks on InfiniBand Networks

Note:

Administrative tasks cannot be performed on RoCE switches.

To perform an administration operation on an InfiniBand Network, follow these steps:

  1. Navigate to the Database Machine home page of the InfiniBand Network: from the All Targets page, choose the Database Machine on which you want to perform an administrative task.

    Enterprise Manager displays the Database Machine Home page for the target you selected.

  2. Navigate to the home page of the Systems Infrastructure Switch on which you want to perform an administrative task.
  3. Go to Administration, and select Switch Operations.
  4. Select the administrative operation you want to execute (Enable/Disable port, Clear performance/Error counters, Switch LED on/off, Set up SNMP subscription). The available operations depend on the target type and the target you selected. After you choose an operation, you may need to select a value for it.
  5. Click Next to continue.

    Enterprise Manager displays the Credentials & Schedule page. Select or enter the credentials to use when the operation is submitted; you can choose between Preferred Credentials, Named Credentials, and New Credentials. Then schedule the administration task by providing the job information in the Administration Job Schedule section. You can begin the job immediately or enter the time at which you want the job to begin.

  6. Click Next to continue.

    The Review page appears. Use the Review page to ensure you have entered the correct values and then submit the command. The Review page lists the Job Name, Description, Command to Execute, when the job is Scheduled, the Target Type, and the Selected Target.

  7. Click Submit Command to submit the job.

    When you click Submit Command, a popup appears if the job is successful. From the popup, you can go to the Job Detail page or return to the page from which the wizard was launched.

Setting Up Alerts

After configuring the InfiniBand Switch targets to send SNMP alerts, set up alerts in Enterprise Manager Cloud Control.

  1. Log in to Enterprise Manager Cloud Control.
  2. Click Targets, then All Targets. All discovered targets are displayed.
  3. In the All Targets page, click Systems Infrastructure Switch.
  4. Click the target you are interested in. The target home page appears.
  5. In the drop-down menu for the Systems Infrastructure Switch, select Monitoring and then Metric and Collection Settings.
  6. In the Metric and Collection Settings page, you can modify metric threshold values, edit monitoring settings for specific metrics, change metric collection schedules, and disable collection of a metric.

    You can modify the thresholds directly in the table or click the edit icon (pencil icon) to access the Edit Advanced Settings page. For more information on the fields displayed in this page and how the thresholds can be modified, click Help from the top-right corner of this page.
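
To illustrate how threshold values translate into alerts, the following is a minimal sketch of threshold evaluation. It is not Enterprise Manager's implementation. It assumes a metric with a warning threshold and a critical threshold, that higher values are worse (a "greater than" comparison), and that an unset threshold never triggers. The function name severity and the example values are hypothetical.

```python
from typing import Optional


def severity(value: float,
             warning: Optional[float],
             critical: Optional[float]) -> str:
    """Classify a metric value against its thresholds.

    Assumes a '>' comparison: the value must exceed a threshold to trigger it.
    A threshold of None is treated as not set.
    """
    # Check the critical threshold first so it takes precedence over warning.
    if critical is not None and value > critical:
        return "CRITICAL"
    if warning is not None and value > warning:
        return "WARNING"
    return "CLEAR"


# Example: a temperature metric with warning at 80 and critical at 90
# (illustrative values, not product defaults).
print(severity(85.0, warning=80.0, critical=90.0))  # WARNING
print(severity(95.0, warning=80.0, critical=90.0))  # CRITICAL
print(severity(50.0, warning=80.0, critical=90.0))  # CLEAR
```

Under these assumptions, raising a threshold makes the corresponding alert less sensitive, and clearing both thresholds suppresses alerts for the metric while its data continues to be collected.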