About box plots

A box plot (also known as a box-and-whisker plot) summarizes the distribution of a continuous variable by plotting its values against horizontal and vertical axes. In a box plot:

  • The box represents approximately the middle 50% of the numeric values (the values between the lower quartile and the upper quartile).
  • A horizontal line within the box represents the median of all values (specifically, the value that is in the middle of the ordered values).
  • The top end of the box represents the upper quartile (specifically, the median of the ordered set of values that are greater than the overall median).
  • The bottom end of the box represents the lower quartile (specifically, the median of the ordered set of values that are less than the overall median).

The interquartile range, which is the difference between the upper quartile and the lower quartile, is a measure of the spread of the distribution. The relative distances of the upper and lower quartiles from the median indicate the shape of the distribution. For example, if the upper quartile is farther from the median than the lower quartile is, the distribution is skewed toward higher values.
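The quartile definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the function name `quartiles` and the sample ages are hypothetical, and the sketch assumes that the halves are split by position in the ordered list, leaving out the middle value when the count is odd. Other tools might use a different quartile convention and give slightly different results.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, median, upper quartile) by taking the median of each half."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower_half = ordered[:half]             # values below the middle position
    upper_half = ordered[half + (n % 2):]   # values above the middle position
    return median(lower_half), median(ordered), median(upper_half)

ages = [22, 25, 29, 31, 35, 38, 41, 47, 52]  # hypothetical sample
q1, med, q3 = quartiles(ages)
iqr = q3 - q1                                # interquartile range
print(q1, med, q3, iqr)                      # prints: 27.0 35 44.0 17.0
```

In this sample the upper quartile is 9 above the median and the lower quartile is 8 below it, so the box would be nearly symmetric around the median line.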

In the box plot:

  • The whisker above the box extends from the upper quartile to the highest actual value that is no greater than the upper quartile (75th percentile) + 1.5 * (interquartile range).
  • The whisker below the box extends from the lower quartile to the lowest actual value that is no less than the lower quartile (25th percentile) - 1.5 * (interquartile range).
  • Outliers are plotted as individual points in the graph. An outlier is a value that falls beyond the end of the upper or lower whisker.
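As a rough sketch of the whisker and outlier rules above, the following Python function finds the whisker ends and the outliers for a list of values. The function name `whiskers_and_outliers` and the sample data are hypothetical, and the quartiles are passed in as arguments, so the result depends on whichever quartile convention produced them.

```python
def whiskers_and_outliers(values, q1, q3):
    """Return (lower whisker end, upper whisker end, outliers) for the given quartiles."""
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr   # 25th percentile - 1.5 * (interquartile range)
    upper_limit = q3 + 1.5 * iqr   # 75th percentile + 1.5 * (interquartile range)
    inside = [v for v in values if lower_limit <= v <= upper_limit]
    outliers = sorted(v for v in values if v < lower_limit or v > upper_limit)
    # Each whisker ends at an actual data value, not at the computed limit.
    return min(inside), max(inside), outliers

# Hypothetical sample whose lower and upper quartiles are 29 and 47.
ages = [22, 25, 29, 31, 35, 38, 41, 47, 52, 95]
print(whiskers_and_outliers(ages, q1=29, q3=47))  # prints: (22, 52, [95])
```

Here the limits are 2 and 74, so the whiskers end at the actual values 22 and 52, and 95 is plotted as an individual outlier point.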

Note:

Points in a box plot are jittered (displayed at small random offsets from the center line). This ensures that if two records have the same value, a point is likely to be displayed for each of them.
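The idea of jittering can be illustrated with a short Python sketch. How the product actually computes the offsets is not documented here, so the uniform random offset, the function name `jitter_positions`, and the `width` parameter are all assumptions made for illustration.

```python
import random

def jitter_positions(count, center=0.0, width=0.1, seed=None):
    """Return `count` horizontal positions scattered at random within `width` of `center`."""
    rng = random.Random(seed)  # a seed makes the offsets repeatable
    return [center + rng.uniform(-width, width) for _ in range(count)]

# Two records with the same value (for example, age 35) get different
# horizontal positions, so both points are likely to be visible.
positions = jitter_positions(2, center=1.0, width=0.1, seed=0)
print(positions)
```

Only the position across the box is changed. The plotted value of each point stays the same.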

Several box plots might be shown in a single graph if you select a secondary variable. In that case, a key appears below the graph to relate each box plot to a value of the secondary variable.

For example, a report shows age and gender for each case ID:

[Image: A report showing age and gender for each case ID]

Depending on the display options that you specify, the graph might look like this:

[Image: Graph of a report showing age and gender for each case ID]
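The grouping behind a graph with a secondary variable can be sketched as follows. The records are hypothetical and the function name `group_values` is illustrative. The sketch only shows how the age values might be split by gender so that one box plot can be drawn per group.

```python
from collections import defaultdict
from statistics import median

# Hypothetical report rows: (case ID, age, gender).
records = [
    (1, 34, "F"), (2, 45, "M"), (3, 29, "F"),
    (4, 52, "M"), (5, 41, "F"), (6, 38, "M"),
]

def group_values(rows):
    """Collect the age values under each value of the secondary variable (gender)."""
    groups = defaultdict(list)
    for _case_id, age, gender in rows:
        groups[gender].append(age)
    return dict(groups)

ages_by_gender = group_values(records)
for gender, ages in sorted(ages_by_gender.items()):
    # Each group would be drawn as its own box plot; the median is its center line.
    print(gender, sorted(ages), "median:", median(ages))
```

Each key of `ages_by_gender` corresponds to one entry in the key below the graph.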