Box Plot

Introduction

The box plot illustrates the “spread” of the data points for one or more variables.

A box plot shows:

  • The range (minimum and maximum values)
  • The measure of central tendency (the median)
  • The distribution symmetry (how the data are distributed around the median, shown by the upper and lower quartiles, UQ and LQ)

It is usually used to compare several sets of observations or data. The box plot is oriented so that the whiskers, which extend to the minimum and maximum values, are vertical. The bottom horizontal edge of each box marks the first quartile and the top horizontal edge marks the third quartile. In addition, the median is drawn as a thick horizontal line across the box.
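The five values that a box plot draws can be computed directly. The following is a minimal sketch using only the Python standard library; the function name `five_number_summary` and the sample values are hypothetical, and the "inclusive" quartile convention is an assumption, since the convention used by the Box Plot tool is not stated here.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    # method="inclusive" interpolates the quartiles between the observed
    # minimum and maximum. Other quartile conventions give slightly
    # different LQ and UQ values (assumption: the tool may use another one).
    lq, median, uq = quantiles(data, n=4, method="inclusive")
    return data[0], lq, median, uq, data[-1]


# Hypothetical values, for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(five_number_summary(sample))
```

For the nine sample values this prints (1, 3.0, 5.0, 7.0, 9): the whiskers would end at 1 and 9, the box would span 3 to 7, and the median line would sit at 5.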

Inputs

To illustrate this tool in use, we will examine the spread of the data for the MBS and PBS spend per capita across Brisbane’s suburbs. To do this:

  • Select Brisbane GCCSA as your area
  • Select SA2 OECD Indicators: MBS and PBS data as your dataset
  • Open the Box Plot tool (Analyse → Tools → Charts → Box Plot) and enter the parameters shown in the image below (each of these is also explained below)
[Click to Enlarge]

  • Dataset Input: Select SA2 OECD Indicators: MBS and PBS data
  • Variable(s): The variables whose spread we would like to examine. Here we select:
    •  MBS spend per capita
    •  PBS spend per capita
  • Use Variable Titles: Check this box to have “human readable names” on your output chart
  • Chart Title: Here we enter the title for the plot. In this instance we have chosen Box Plot: MBS and PBS Spend per capita in Brisbane SA2s
  • Grid: Select this if you want gridlines displayed on your graph
  • Greyscale: Select this if you want your graph to be in greyscale, rather than in the default colour

Once you have entered the parameters, click Add and Run.
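For readers who want to reproduce a similar chart outside the portal, the parameters above map loosely onto a plotting call. The sketch below assumes the third-party matplotlib library is installed. The function `draw_box_plot`, its arguments, the colours and the sample values are all hypothetical, and it is not a description of how the Box Plot tool itself is implemented.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so no display is required
import matplotlib.pyplot as plt


def draw_box_plot(series, title, grid=False, greyscale=False):
    """Draw one vertical box per named series; return (figure, axes, artists)."""
    fig, ax = plt.subplots()
    # whis=(0, 100) extends the whiskers to the minimum and maximum values.
    artists = ax.boxplot(list(series.values()), whis=(0, 100), patch_artist=True)
    # Label each box with its variable title ("Use Variable Titles").
    ax.set_xticklabels(list(series.keys()))
    ax.set_title(title)  # "Chart Title"
    ax.grid(grid)  # "Grid"
    for box in artists["boxes"]:  # "Greyscale" (colour choices are assumptions)
        box.set_facecolor("lightgrey" if greyscale else "lightblue")
    for line in artists["medians"]:  # thick median line across each box
        line.set_color("black")
        line.set_linewidth(2)
    return fig, ax, artists


# Hypothetical per-capita values, for illustration only.
spend = {
    "MBS spend per capita": [620, 700, 760, 800, 840, 900, 1100],
    "PBS spend per capita": [300, 320, 350, 400, 450, 490, 520],
}
fig, ax, artists = draw_box_plot(
    spend,
    "Box Plot: MBS and PBS Spend per capita in Brisbane SA2s",
    grid=True,
)
```

Calling `fig.savefig("box_plot.png")` afterwards would write the chart to an image file.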

Outputs

Once your tool has run, click the Display button on the pop-up dialogue box that appears.

Your output should look something like the graph below. This indicates that the median PBS spend per capita is lower than the median MBS spend per capita, and that MBS spend has a much larger range. The spread around the median value is roughly similar for both funding schemes.
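The same three comparisons (median, range, and spread around the median) can be checked numerically. The sketch below uses only the Python standard library. The function `spread_summary` is hypothetical, and the values are invented to mimic the pattern described above; they are not the Brisbane SA2 data.

```python
from statistics import median, quantiles


def spread_summary(values):
    """Return the median, range and interquartile range (IQR) of the values."""
    data = sorted(values)
    # Quartile convention is an assumption; see statistics.quantiles.
    lq, _, uq = quantiles(data, n=4, method="inclusive")
    return {
        "median": median(data),
        "range": data[-1] - data[0],  # whisker tip to whisker tip
        "iqr": uq - lq,  # height of the box
    }


# Invented values, for illustration only.
mbs = [620, 700, 760, 800, 840, 900, 1100]
pbs = [300, 320, 350, 400, 450, 490, 520]

for name, values in (("MBS", mbs), ("PBS", pbs)):
    print(name, spread_summary(values))
```

With these invented values the PBS median (400) is lower than the MBS median (800), the MBS range (480) is much larger than the PBS range (220), and the two interquartile ranges (140.0 and 135.0) are close, which is the pattern a reader would look for in the boxes and whiskers.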

[Click to Enlarge]
