Moran’s I Scatterplot
Introduction
A Moran’s I Scatterplot is a convenient way of visualising how spatially clustered or autocorrelated your variable is.
The X axis of the scatterplot represents the standardised values of your variable (that is, the values have been converted to Z scores, with a mean of zero and a standard deviation of 1).
The Y axis represents the standardised values of the areas neighbouring your area of interest, that is, the lagged values. These are calculated according to the spatial weights matrix that you specify. So, for instance, if you specify a contiguous spatial weights matrix with first order queen contiguity, the value on the Y axis is the mean value of the variable across all of the areas that share a border (or, under queen contiguity, a corner) with the area of interest.
A Moran’s I Scatterplot is divided into four quadrants, which, clockwise from top-right, represent: High values surrounded by High values, High values surrounded by Low values, Low values surrounded by Low values, and Low values surrounded by High values.
A positive slope indicates positive spatial autocorrelation (high values clustered with high, low clustered with low), while a negative slope indicates negative spatial autocorrelation (low values clustered with high, high values clustered with low).
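The quantities described above can be sketched in a few lines of Python with NumPy. This is only an illustration of the idea, not the portal's implementation: the function name moran_scatter, the four-area chain and its values are all made up for the example, and the weights are assumed to be row-standardised so that each lagged value is the mean of the neighbours' Z scores.

```python
import numpy as np

def moran_scatter(values, weights):
    """Return Z scores, lagged values, slope and quadrant labels (illustrative helper)."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Row-standardise so each area's lag is the mean of its neighbours (assumption).
    row_sums = w.sum(axis=1, keepdims=True)
    w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    z = (x - x.mean()) / x.std()      # X axis: standardised values
    lag = w @ z                       # Y axis: mean Z score of the neighbours
    # With row-standardised weights and no isolated areas, this slope equals Moran's I.
    slope = (z @ lag) / (z @ z)
    labels = np.where(
        z >= 0,
        np.where(lag >= 0, "High-High", "High-Low"),
        np.where(lag >= 0, "Low-High", "Low-Low"),
    )
    return z, lag, slope, labels

# Hypothetical example: four areas in a row, each bordering the next one.
chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
z, lag, slope, labels = moran_scatter([1, 2, 8, 9], chain)
print(round(slope, 2))   # approximately 0.54, i.e. positive autocorrelation
print(list(labels))      # two Low-Low areas followed by two High-High areas
```

Plotting z against lag and drawing a line with this slope through the origin gives the same kind of picture as the scatterplot the tool produces.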
Inputs
For this tutorial we will model the spatial autocorrelation of socio-economic disadvantage across Melbourne.
To do this:
- Select Melbourne GCCSA as your area
- Select SA2 SEIFA 2011 – The Index of Relative Socio-Economic Disadvantage (IRSD) as your dataset
- Spatialise the dataset that you have just loaded
- Generate a Contiguous Spatial Weights Matrix for the dataset that you have just loaded.
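To give a feel for what a contiguous spatial weights matrix contains, here is a minimal Python sketch that builds first order queen contiguity for a toy 3 x 3 grid of cells. The portal derives the matrix from the real SA2 boundaries, so the grid, the function name queen_weights and the binary 0/1 coding are assumptions made purely for illustration.

```python
import numpy as np

def queen_weights(n_rows, n_cols):
    """Binary first order queen contiguity for a regular grid (illustrative helper)."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            # Queen contiguity: any cell sharing an edge or a corner is a neighbour.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) != (0, 0) and 0 <= rr < n_rows and 0 <= cc < n_cols:
                        w[r * n_cols + c, rr * n_cols + cc] = 1.0
    return w

w = queen_weights(3, 3)
neighbour_counts = w.sum(axis=1)
print(neighbour_counts)  # corners have 3 neighbours, edge cells 5, the centre 8
```

Each row of the matrix marks which areas count as neighbours of one area; this is the information the scatterplot tool relies on when it computes the lagged values.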
We now want to open the Moran’s I Scatterplot chart (Tools → Charts → Moran’s I Scatterplot). Enter your parameters as shown below (the meanings of all these inputs are explained below), and click Add and Run.
The required parameters for the Moran’s I Scatterplot are explained here:
- Dataset: This is the dataset that you would like to run the scatterplot on. It is important to realise that this needs to be a spatialised dataset. This is either a dataset that you’ve obtained in the portal and spatialised, or a shapefile that you’ve uploaded. A dataset that you’ve obtained through the portal that hasn’t subsequently been spatialised will fail. Here we use SPATIALISED SA2 SEIFA 2011 – The Index of Relative Socio-Economic Disadvantage (IRSD)
- Spatial Weights Matrix: You will need to have generated a spatial weights matrix of your spatialised dataset in order for the tool to “understand” how your areas are related in space – that is, whether they are counted as near or not. Here we use CONTIG SWM SA2 SEIFA 2011 – The Index of Relative Socio-Economic Disadvantage (IRSD)
- Key Column: This is how you identify your areas – either using the region code or aggregate code that comes with the original dataset. In this instance, use the field named SA2 Code
- Moran’s I Variable: The variable you’d like to investigate. In this instance use the field named Score
- Moran’s I Alternative Hypothesis: Whether you’d like to test a two sided hypothesis, or one sided (greater/lesser). In this instance, choose two.sided
- Moran’s Inference: Whether you’d like to include the Moran’s I statistic on your graph. In this instance, keep the box ticked
- Chart Title: A title for your graph.
- Grid: Whether you’d like to include gridlines on your graph
- Legend: Whether you’d like to include a legend with your graph
- Greyscale: Whether you’d like to have your graph in colour (unchecked) or grey-scale (checked)
Once you have entered your parameters, click Add and Run to execute the tool.
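The difference between the two sided and one sided alternative hypotheses can be illustrated with a small permutation test in Python. This is a sketch under stated assumptions, not a description of how the tool computes its inference: the source does not say which method the portal uses, the option names "greater" and "less" are assumed (only two.sided appears above), and the helper names and the ten-area chain are hypothetical.

```python
import numpy as np

def morans_i(x, w):
    """Moran's I for values x and a spatial weights matrix w."""
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)

def permutation_p_value(values, weights, alternative="two.sided", n_perm=999, seed=0):
    """Pseudo p-value from randomly reshuffling values across areas (illustrative helper)."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    rng = np.random.default_rng(seed)
    observed = morans_i(x, w)
    simulated = np.array([morans_i(rng.permutation(x), w) for _ in range(n_perm)])
    if alternative == "greater":      # more clustered than expected by chance
        extreme = np.sum(simulated >= observed)
    elif alternative == "less":       # more dispersed than expected by chance
        extreme = np.sum(simulated <= observed)
    elif alternative == "two.sided":  # different from chance in either direction
        expected = -1.0 / (len(x) - 1)  # expected Moran's I under no autocorrelation
        extreme = np.sum(np.abs(simulated - expected) >= abs(observed - expected))
    else:
        raise ValueError(f"unknown alternative: {alternative}")
    return observed, (extreme + 1) / (n_perm + 1)

# Hypothetical example: ten areas in a row with steadily increasing values.
n = 10
chain = np.zeros((n, n))
for i in range(n - 1):
    chain[i, i + 1] = chain[i + 1, i] = 1.0
values = np.arange(n)

observed, p_two_sided = permutation_p_value(values, chain, "two.sided")
_, p_greater = permutation_p_value(values, chain, "greater")
_, p_less = permutation_p_value(values, chain, "less")
```

For strongly clustered data like this, the "greater" test gives a small p-value and the "less" test a large one, while the two sided test asks only whether the pattern differs from random in either direction.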
Outputs
Once your tool has finished running, click the Display button that appears in the pop-up dialogue box. This will open a scatterplot that looks something like the image shown below. This graph indicates that there is positive spatial autocorrelation in the Index of Relative Socio-economic Disadvantage across Melbourne suburbs (that is, disadvantaged suburbs tend to be found clustered with other disadvantaged suburbs, and vice versa).