Using Analytics to examine CPU utilization and NFSv3 operation latency
This is the main interface for Analytics. See Concepts for an overview of Analytics.
A worksheet is a view where multiple statistics may be graphed. The screenshot at the top of this page shows two statistics:
CPU: percent utilization broken down by CPU identifier - as a graph
Protocol: NFSv3 operations per second broken down by latency - as a quantize plot
The following sections introduce Analytics features based on that screenshot.
The CPU utilization statistic in the screenshot is rendered as a graph. Graphs provide the following features:
The left panel lists components of the graph, if available. Since this graph was "... broken down by CPU identifier", the left panel lists CPU identifiers. Only components which had activity in the visible window (or selected time) will be listed on the left.
Left panel components can be clicked to highlight their data in the main plot window.
Left panel components can be shift-clicked to highlight multiple components at a time (such as in this example, with all four CPU identifiers highlighted).
Left panel components can be right-clicked to show available drilldowns.
Only ten left panel components will be shown to begin with, followed by "...". You can click the "..." to reveal more. Keep clicking to expand the list completely.
The graph window on the right can be clicked to highlight a point in time. In the example screenshot, 15:52:26 was selected. Click the pause button followed by the zoom icon to zoom into the selected time. Click the time text to remove the vertical time bar.
If a point in time is highlighted, the left panel of components will list details for that point in time only. Note that the text above the left box reads "At 15:52:26:", to indicate what the component details are for. If a time wasn't selected, the text would read "Range average:".
The y-axis auto-scales to keep the highest point in the graph visible (except for utilization statistics, which are fixed at 100%).
The line graph button changes this graph to plot just lines without the flood fill. This can be useful for two reasons. First, some of the finer detail in line plots can be lost in the flood fill, so selecting line graphs can improve resolution. Second, it can be used to zoom vertically into component graphs: select one or more components on the left, then switch to the line graph.
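The left panel behavior described above (a heading of "At <time>:" or "Range average:", only active components listed, ten shown before "...") can be illustrated with a minimal sketch. The function name `left_panel`, its input layout, and the descending sort order are assumptions made for illustration; they are not taken from the appliance software.

```python
def left_panel(timestamps, series, selected_time=None, limit=10):
    """Sketch of the left panel listing for a graph.

    timestamps: per-second labels for the visible window (hypothetical layout)
    series: dict mapping component name -> list of values, one per timestamp
    Returns (heading, rows, has_more), where has_more stands for the "..." entry.
    """
    if selected_time is not None:
        # A point in time is highlighted: show values for that second only.
        idx = timestamps.index(selected_time)
        heading = f"At {selected_time}:"
        values = {name: vals[idx] for name, vals in series.items()}
    else:
        # No time selected: show the average over the visible range.
        heading = "Range average:"
        values = {name: sum(vals) / len(vals) for name, vals in series.items()}

    # Only components with activity are listed (assumes non-negative values).
    active = [(name, value) for name, value in values.items() if value > 0]
    # Sorting by value is an assumption; the source does not state the order.
    active.sort(key=lambda row: row[1], reverse=True)
    return heading, active[:limit], len(active) > limit
```

For example, calling `left_panel(["15:52:25", "15:52:26"], {"cpu0": [10, 30], "cpu1": [20, 40]}, selected_time="15:52:26")` yields the heading "At 15:52:26:" with per-component values for that second.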
The NFS latency statistic in the screenshot is rendered as a quantize plot. The name refers to how the data is collected and displayed. For each statistic update, data is quantized into buckets, which are drawn as blocks on the plot. The more events in that bucket for that second, the darker the block will be drawn.
The example screenshot shows NFSv3 operations spread out to 9 ms and beyond, with latency on the y-axis, until an event kicked in about halfway through and latency dropped to less than 1 ms. Other statistics can be plotted to explain the drop in latency: the filesystem cache hit rate showed steady misses go to zero at this point. A workload had been randomly reading from disk (0 to 9+ ms latency) and then switched to reading files that were cached in DRAM.
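The quantize idea can be sketched in a few lines: events are counted per (second, latency bucket), and the count is mapped to a shade. This is only an illustration. The fixed 1 ms bucket width, the eight shade levels, and the names `quantize` and `shade` are assumptions; the appliance may choose buckets and shades differently.

```python
from collections import Counter
from math import ceil

def quantize(events, bucket_ms=1.0):
    """Count events per (second, latency bucket).

    events: iterable of (second, latency_ms) pairs (hypothetical input layout).
    bucket_ms: assumed fixed bucket width in milliseconds.
    """
    counts = Counter()
    for second, latency_ms in events:
        bucket = int(latency_ms // bucket_ms)  # bucket 0 covers 0 to <1 ms, and so on
        counts[(second, bucket)] += 1
    return counts

def shade(count, max_count, levels=8):
    """Map an event count to a darkness level: 0 is empty, levels - 1 is darkest."""
    if count <= 0 or max_count <= 0:
        return 0
    return ceil(count / max_count * (levels - 1))
```

With this sketch, a second in which most operations complete in under 1 ms produces one dark block in bucket 0, while a random-read workload spreads lighter blocks across many buckets.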
Quantize plots are used for I/O latency, I/O offset and I/O size, and provide the following features:
Detailed understanding of the data profile (not just the average, maximum or minimum): these plots visualize all events and promote pattern identification.
Vertical outlier elimination. Without this, the y-axis would always be compressed to include the highest event. Click the crop outliers icon to toggle between different percentages of outlier elimination. Mouse over this icon to see the current value.
Vertical zoom: click a low point from the list in the left box, then shift-click a high point. Now click the crop outliers icon to zoom to this range.
Graphs by filename have a special feature - "Show hierarchy" text will be visible on the left. When clicked, a pie-chart and tree view for the traced filenames will be made available.
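The vertical outlier elimination listed above can be thought of as choosing a y-axis ceiling after discarding a percentage of the highest events. The sketch below shows that idea only; the function name `crop_ceiling` and the exact rounding are assumptions, and the percentages offered by the crop outliers icon are not reproduced here.

```python
def crop_ceiling(latencies, drop_percent):
    """Return a y-axis ceiling after discarding the highest drop_percent of events.

    latencies: list of observed values (for example, latency in ms).
    drop_percent: share of the highest events to ignore, 0 to 100.
    """
    if not latencies:
        return 0.0
    ordered = sorted(latencies)
    dropped = int(len(ordered) * drop_percent / 100)
    keep = max(len(ordered) - dropped, 1)  # always keep at least one event
    return ordered[keep - 1]
```

With 99 events at 1 ms and a single 500 ms event, dropping 0% leaves the ceiling at 500 and compresses the plot, while dropping 1% lowers the ceiling to 1.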
The following screenshot shows the hierarchy view:
As with graphs, the left panel will show components based on the statistic breakdown, which in this example was by filename. Filenames can get too long for the left panel; try expanding it by clicking and dragging the divider between it and the graph, or use the hierarchy view.
The hierarchy view provides the following features:
The filesystem may be browsed, by clicking "+" and "-" next to file and directory names.
File and directory names can be clicked, and their component will be shown in the main graph.
Shift-click pathnames to display multiple components at once, as shown in this screenshot.
The pie chart on the left shows the ratio of each component to the total.
Slices of the pie may be clicked to perform highlighting.
If the graph isn't paused, the data will continue to scroll. The hierarchy view can be refreshed to reflect the data visible in the graph by clicking "Refresh hierarchy".
There is a close button on the right to close the hierarchy view.
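The tree view and pie chart described above both come down to rolling per-file values up into directories and comparing each component to the total. A minimal sketch, assuming a plain dict of pathname to value as input; the names `hierarchy_totals` and `pie_ratios` are hypothetical.

```python
from collections import defaultdict

def hierarchy_totals(per_file):
    """Roll per-file values up into every ancestor directory.

    per_file: dict mapping an absolute pathname -> value (hypothetical layout).
    Returns a dict mapping each file and directory path -> summed value.
    """
    totals = defaultdict(float)
    for path, value in per_file.items():
        parts = path.strip("/").split("/")
        # Add the file's value to the file itself and to each directory above it.
        for depth in range(1, len(parts) + 1):
            totals["/" + "/".join(parts[:depth])] += value
    return dict(totals)

def pie_ratios(per_file):
    """Ratio of each component to the total, as a pie chart would show it."""
    total = sum(per_file.values())
    if not total:
        return {}
    return {path: value / total for path, value in per_file.items()}
```

Because the graph keeps scrolling unless paused, totals like these would need to be recomputed from the currently visible data, which is what "Refresh hierarchy" does in the interface.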
The following features are common to graphs and quantize plots:
The height may be expanded. Look for a white line beneath the middle of the graph, then click and drag it downwards.
The width will expand to match the size of your browser.
Click and drag the move icon to change the vertical position of a statistic on the worksheet.
Normally graphs are displayed with various colors against a white background. If data is unavailable for any reason the graph will be filled with a pattern to indicate the specific reason for data unavailability:
The gray pattern indicates that the given statistic was not being recorded for the time period indicated. This is either because the user had not yet specified the statistic or because data gathering had been explicitly suspended.
The red pattern indicates that data gathering was unavailable during that period. This is most commonly seen because the system was down during the time period indicated.
The orange pattern indicates an unexpected failure while gathering the given statistic. This can be caused by a number of aberrant conditions. If it is seen persistently or in critical situations, contact your authorized support resource and/or submit a support bundle.
Worksheets can be saved for later viewing. As a side effect, all visible statistics will be archived - meaning that they will continue to save new data after the saved worksheet has been closed.
To save a worksheet, click the "Untitled worksheet" text to name it first, then click "Save" from the local navigation bar. Saved worksheets can be opened and managed from the Saved Worksheets section.
A toolbar of buttons is shown above graphed statistics. The following is a reference for their function:
Mouse over each button to see a tooltip to describe the click behavior.
Viewing analytics statistics is possible from the CLI. See:
Reading Datasets - for listing recent statistics from available datasets.
Saved Worksheets:CLI - for how to dump worksheets in CSV, which may be suitable for automated scripting.
If you'd like to save a worksheet that displays an interesting event, make sure the statistics are paused first (sync all statistics, then hit pause). Otherwise the graphs will continue to scroll, and when you open the worksheet later the event may no longer be on the screen.
If you are analyzing issues after the fact, you will be restricted to the datasets that were already being archived. Visual correlations can be made between them when the time axis is synchronized. If the same pattern is visible in different statistics, there is a good chance that it is related activity.
Be patient when zooming out to the month view and longer. Analytics manages long-period data efficiently, but there can still be delays when zooming out to long periods.
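The visual correlation mentioned in the tips above can be cross-checked numerically once two statistics have been exported over the same time range. The sketch below uses a Pearson correlation coefficient, which is a technique substituted here for illustration and is not something Analytics is stated to compute. The function name `pearson` and the sample values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long, time-aligned series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a flat series carries no pattern to correlate
    return cov / (dev_x * dev_y)

# Hypothetical per-second samples: latency (ms) and cache misses dropping together.
latency_ms = [9, 9, 8, 1, 1, 1]
cache_misses = [100, 95, 90, 0, 0, 0]
coefficient = pearson(latency_ms, cache_misses)
```

A coefficient near 1 or -1 suggests the two statistics move together, which supports (but does not prove) that the activity is related.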