Classifiers

Introduction

The Classifiers tool allows you to break up a variable into separate classes based on the range of its values. For example, you may want to break up a column of data into percentiles, or into standard deviation bands. You specify the number and type of classes, and the tool produces a table in which each record is assigned to a class based on your parameters.

Inputs

To show this tool in use, we will classify some poverty data across Tasmania.

  • Select Tasmania as your area
  • Select SA2 OECD Indicators: Income, Inequality and Financial Stress 2011 as your dataset, selecting Poverty Rate (Synthetic Data) as your variable

Once you have done this, open the tool (Tools → Statistical Analysis → Classifiers) and enter the parameters as shown in the image below. Each parameter is explained beneath the image.

[Click to Enlarge]

  • Dataset Input: This is where you select the dataset with the variable(s) you would like to classify. In this instance, we select SA2 OECD Indicators: Income, Inequality and Financial Stress 2011.
  • Variable Name: This is where you select the variable(s) that you would like to classify. You can select more than one. For this example, we select Poverty Rate (Synthetic Data).
  • Number of Classes: This is where you specify how many different classes you would like to create. We are going to select 5 for this example.
  • Type: This is where you specify how you would like to break up the data. The options are sd (standard deviations), equal (equally sized classes, depending on the range of values), quantile (classes with an equal number of values or observations in each class), fisher, jenks, kmeans, hclust and bclust. For this example, we will choose quantile, which will produce 5 classes, each containing 20% of the SA2s based on poverty rates.
  • TWGD Performance Statistic: Total Within Group Difference. Leave this checked.
  • TWGV Performance Statistic: Total Within Group Variance. Leave this checked.
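
To make the difference between the quantile and equal types concrete, the following is a minimal Python sketch and not the tool's own implementation. The function names and the poverty rates are made up for illustration, and the interpolation rule used for the quantile cut points is an assumption that may differ from the one the tool applies.

```python
import statistics


def quantile_breaks(values, n_classes):
    """Break points chosen so each class holds roughly the same number of values."""
    # statistics.quantiles returns the n_classes - 1 interior cut points.
    cuts = statistics.quantiles(values, n=n_classes, method="inclusive")
    return [min(values)] + cuts + [max(values)]


def equal_breaks(values, n_classes):
    """Break points that split the range of the data into equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + i * width for i in range(n_classes)] + [hi]


# Hypothetical poverty rates (%) for ten areas, not real data.
rates = [4.0, 5.5, 6.0, 7.5, 8.0, 9.0, 10.5, 12.0, 15.0, 30.0]

print("quantile:", quantile_breaks(rates, 5))
print("equal:   ", equal_breaks(rates, 5))
```

With skewed data such as this, the equal type places most of the values in the lowest class, while the quantile type places two values in each of the five classes.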

Once you have entered your parameters, click Add and Run to execute the tool.

Outputs

Once the tool has run, the following pop-up dialogue box will appear on your screen. Make sure both boxes are checked and click Display.

[Click to Enlarge]

This will open two outputs. The Text: Classifiers1 XXX file contains, at the bottom of the window, the upper and lower values for each of the classification bands, together with the TWGD and TWGV statistics.

The Output: Classifiers1 XXX file contains three new columns – the number of the band that each area falls into (in this instance, 1 – 5), the lower limit value of that band, and the upper limit value of that band.
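
The three output columns can be reproduced from a list of break points, as in the minimal Python sketch below. The function name, area names and break points are hypothetical. Placing a value that equals a break point in the lower band is an assumption, and the tool may treat boundaries differently.

```python
from bisect import bisect_left


def assign_bands(values, breaks):
    """Return (band number, lower limit, upper limit) for each value.

    Bands are numbered from 1. A value equal to an interior break point
    is placed in the lower of the two bands (an assumption).
    """
    interior = breaks[1:-1]
    rows = []
    for value in values:
        idx = bisect_left(interior, value)
        rows.append((idx + 1, breaks[idx], breaks[idx + 1]))
    return rows


# Hypothetical break points for 5 classes, and made-up poverty rates (%).
breaks = [4.0, 5.9, 7.8, 9.6, 12.6, 30.0]
rates = {"Area A": 4.0, "Area B": 7.5, "Area C": 9.6, "Area D": 30.0}

for name, (band, lower, upper) in zip(rates, assign_bands(list(rates.values()), breaks)):
    print(name, band, lower, upper)
```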

These outputs are shown in the image below.

[Click to Enlarge]
