4.8 System Diagnostics Data Gathering with sosreports and Oracle ExaWatcher

You can use the sosreport utility and Oracle ExaWatcher to diagnose problems with your system.

Every time a server is started, system-wide configuration information is collected by the sosreport utility, and stored in the /var/log/cellos/sosreports directory. You can generate a new sosreport by running the following command as the root user. The script starts collecting the information 30 minutes after entering the command.

/opt/oracle.cellos/vldrun -script sosreport
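After a report has been generated, you may want to find the newest entry in the sosreports directory. The following is a minimal sketch, not part of the product: the function name `latest_sosreport` and the `SOS_DIR` variable are hypothetical, and the sketch only assumes that reports are written under the directory named above. It makes no assumption about report file names.

```shell
#!/bin/sh
# Directory named in the text above; override SOS_DIR to point elsewhere.
SOS_DIR="${SOS_DIR:-/var/log/cellos/sosreports}"

# latest_sosreport DIR  (hypothetical helper)
# Prints the name of the most recently modified entry in DIR.
# Returns 1 if DIR does not exist.
latest_sosreport() {
    if [ ! -d "$1" ]; then
        echo "no such directory: $1" >&2
        return 1
    fi
    # ls -t sorts by modification time, newest first.
    ls -1t "$1" | head -n 1
}

latest_sosreport "$SOS_DIR" || echo "cannot read $SOS_DIR"
```

Sorting by modification time avoids depending on any particular naming scheme for the report files.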

In addition, the /opt/oracle.ExaWatcher directory contains the Oracle ExaWatcher system data gathering and reporting utilities. Gathered data is stored in archive subdirectories. The following table describes the data gathered at different intervals by the utility:

Table 4-4 Oracle ExaWatcher Collector Names and Descriptions

| Collector Name | Description |
| --- | --- |
| CellSrvStat | Cell server status. |
| Diskinfo | I/O statistics of the disk, such as successfully completed reads, merged reads, time spent reading, and so on. |
| FlashSpace | RAW value of the flash card space. Minimum interval limit is 300 seconds. |
| IBCardInfo | (Currently not available for X8M systems) RDMA Network Fabric card information, and status of InfiniBand Network Fabric ports. Minimum interval is 300 seconds. |
| IBprocs | Commands that check the RDMA Network Fabric card status. Minimum interval is 600 seconds. |
| Iostat | CPU statistics, and I/O statistics for devices and partitions. |
| Lsof | Files opened by current processes. Minimum interval limit is 120 seconds. |
| MegaRaidFW | MegaRaid firmware information, such as battery information. Minimum interval is 86400 seconds. |
| Meminfo | Memory management by the kernel. |
| Mpstat | Microprocessor statistics. |
| Netstat | Current network connection statistics. |
| Ps | Active processes statistics. |
| RDSinfo | Availability of cell servers. Interval limit is 30 seconds. |
| Slabinfo | Caches for frequently-used objects in the kernel. |
| Top | Dynamic, real-time view of the system. |
| Vmstat | Virtual memory status. |
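The minimum intervals in Table 4-4 can be checked before you choose a sampling interval for a group. The following sketch encodes only the values that the table explicitly labels as minimums. The function names `min_interval` and `check_interval` are hypothetical, and RDSinfo is deliberately treated as having no minimum because the table gives its limit without saying whether it is a lower bound.

```shell
#!/bin/sh
# min_interval COLLECTOR  (hypothetical helper)
# Prints the minimum sampling interval, in seconds, that Table 4-4 lists
# for COLLECTOR, or 0 when the table states no explicit minimum.
min_interval() {
    case "$1" in
        FlashSpace|IBCardInfo) echo 300 ;;
        IBprocs)               echo 600 ;;
        Lsof)                  echo 120 ;;
        MegaRaidFW)            echo 86400 ;;
        *)                     echo 0 ;;
    esac
}

# check_interval COLLECTOR SECONDS  (hypothetical helper)
# Returns 0 if SECONDS meets the listed minimum for COLLECTOR, 1 otherwise.
check_interval() {
    min=$(min_interval "$1")
    if [ "$2" -lt "$min" ]; then
        echo "$1: ${2}s is below the listed minimum of ${min}s"
        return 1
    fi
    echo "$1: ${2}s is acceptable"
}

check_interval Lsof 60 || true
check_interval Vmstat 5
```

This check is independent of the utility itself; how Oracle ExaWatcher reacts to an interval below the minimum is not described here.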

To use Oracle ExaWatcher, do the following:

  1. As the root user, start the Oracle ExaWatcher processes and service.

    # systemctl start ExaWatcher
  2. Run the Oracle ExaWatcher utility as the root user.

    /opt/oracle.ExaWatcher/ExaWatcher.sh [options]
    
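The two steps above can be wrapped in a small script. This is a sketch under stated assumptions: the `run_step` and `start_exawatcher` functions and the `DRY_RUN` variable are hypothetical and not part of Oracle ExaWatcher. By default the script only prints the commands it would run, so it can be reviewed safely before being used on a server.

```shell
#!/bin/sh
# Path to the utility as given in step 2.
EXAWATCHER="/opt/oracle.ExaWatcher/ExaWatcher.sh"

# run_step CMD [ARGS...]  (hypothetical helper)
# Prints the command when DRY_RUN is 1 (the default), otherwise runs it.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# start_exawatcher [OPTIONS...]  (hypothetical helper)
# Starts the service, then runs the utility with the given options.
start_exawatcher() {
    # Both steps in the procedure are performed as the root user.
    if [ "${DRY_RUN:-1}" != "1" ] && [ "$(id -u)" -ne 0 ]; then
        echo "must be run as root" >&2
        return 1
    fi
    run_step systemctl start ExaWatcher
    run_step "$EXAWATCHER" "$@"
}

start_exawatcher --help
```

Setting `DRY_RUN=0` before calling `start_exawatcher` would run the commands instead of printing them.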

The following options are available for use with the Oracle ExaWatcher utility:

Option Description

No options specified

The utility runs using the default options.

-c | --command 'collector_name ;; "default_command; ... " '

Changes the core command that is run for the current group. Only the following core commands can be changed:

CellSrvStat

Iostat

Mpstat

Netstat

Ps

Top

Vmstat

Example: --command 'Vmstat;; "vmstat -a"'

--createconf "config_file_to_create" | null

The utility parses all command-line inputs, validates them, and creates a configuration file. If the file path and name are not specified, then the utility overwrites the default configuration file.

-d | --disable "collector_name"

The name of the collector to disable.

Example: --disable "Vmstat"

-e | --end "end_time"

The ending time for the current group. The default value is 10 years from the current time.

Example: --end "11/06/2013 12:01:00"

--fromconf "configuration_file" | null

The configuration file to use with the Oracle ExaWatcher utility. The default configuration file is as follows:

/opt/oracle.ExaWatcher/ExaWatcher.conf for Oracle Linux

-g | --group

Starts a new group for gathering data. Other options can be specified with the group option.

-h | --help

Displays help information.

-i | --interval "interval_length"

The sampling interval for the current group, in seconds. The default value is 5 seconds.

Certain collection modules cannot be run every second because of the resources they consume.

Example: --interval 10

-l | --spacelimit

Sets the limit for the amount of storage space used by the utility. The limit is specified in MB. On database servers, the default is 6 GB. On storage servers, the default is 600 MB.

Example: --spacelimit 900

--lastconf

The most-recent configuration file used with the utility.

Data is not collected when using this option.

--listcmd "Full"|"Nameonly"|"Core"|"CMD"|"Enabled"|null

Displays information about the command inputs. The following are the options:

Full displays all the information about the commands and samplers.

Nameonly displays each name and whether it is enabled.

Core displays only the core sampler information.

CMD displays each name, whether it is enabled, and its default commands.

-m | --commandmode {"ALL" | "CORE" | "SELECTED"}

The type of collection modules to run for the current group. The following are the options:

ALL runs all collection modules.

CORE runs only the core collection modules.

SELECTED runs only the specified collection modules.

The default value is ALL.

Example: --commandmode "CORE"

-o | --count "archiving_count"

The archive count of the current group. The default value is 720.

Example: --count 500

-r | --resultdir "result_directory"

The directory path to store the results of the data collection.

Example: --resultdir "/opt/oracle.ExaWatcher/archive"

--stop

Stops the utility and all its processes, and then compresses the data files.

-t | --start "start_time"

The starting time for the current group. The default is 20 seconds from the current time.

Example: --start "11/05/2013 12:00:00"

-u | --customcmd 'sample_name ;; "custom_command;... " '

Includes a custom collection module in the current group.

Example: --customcmd 'Lsl;; "/bin/ls -l"'

-z | --zip "bzip2" | "gzip"

The compression program to use on the collected data. The default program is bzip2.

Example: --zip "gzip"
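Because other options can be specified together with the group option, the individual examples above can be combined into one invocation. The following sketch only prints such a command line and does not run it. The function name `build_group_cmd` is hypothetical, and the argument checks reflect only the values listed in the option descriptions above. Whether a particular combination suits your system is something to confirm with the utility's own help output.

```shell
#!/bin/sh
# Path to the utility as given earlier in this section.
EXAWATCHER="/opt/oracle.ExaWatcher/ExaWatcher.sh"

# build_group_cmd INTERVAL COUNT MODE ZIP  (hypothetical helper)
# Prints one ExaWatcher invocation for a new group; nothing is executed.
build_group_cmd() {
    # --commandmode accepts only the three documented values.
    case "$3" in
        ALL|CORE|SELECTED) ;;
        *) echo "unknown command mode: $3" >&2; return 1 ;;
    esac
    # --zip accepts only the two documented compression programs.
    case "$4" in
        bzip2|gzip) ;;
        *) echo "unknown compression program: $4" >&2; return 1 ;;
    esac
    printf '%s --group --interval %s --count %s --commandmode "%s" --zip "%s"\n' \
        "$EXAWATCHER" "$1" "$2" "$3" "$4"
}

build_group_cmd 10 500 CORE gzip
```

Printing the command first makes it easy to review the options before running the utility as the root user.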