# Moran’s I

## Introduction

Moran’s I is a “global” measure of spatial autocorrelation across an entire study area. The Moran’s I statistic (Moran, 1948) indicates the degree of linear association between the observation vector ($$x$$) and a vector of spatially weighted averages of neighbouring values ($$W_{x}$$), where $$W$$ is the spatial weight matrix that formalises the neighbourhood or contiguity structure of the dataset.
The Moran’s I statistic is computed as:

#### $$I_{t} = {{R\sum\nolimits_{i=1}^R \sum\nolimits_{j=1}^R x_{i}x_{j}w_{ij}}\over {R_{b}\sum\nolimits_{i=1}^R x_{i}^2}}$$

where $$R$$ is the number of regions in the dataset; $$R_{b}$$ is the sum of the weights, which simplifies to $$R$$ if the spatial weight matrix is row-standardised; and $$x_{i}$$ is the variable being tested, measured as a deviation from the mean, i.e. $$x_{i} = X_{i} - \overline{X}$$.

The proximity between areas $$i$$ and $$j$$ is given by $$w_{ij}$$, the corresponding element of the spatial weight matrix.
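As a minimal sketch of the formula above, the following plain-Python function computes Moran’s I for a small, made-up example. The function name `morans_i`, the four-region layout and the values are illustrative assumptions, not part of the tool.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    r = len(values)                                  # R: number of regions
    mean = sum(values) / r
    x = [v - mean for v in values]                   # deviations from the mean
    r_b = sum(sum(row) for row in weights)           # R_b: sum of all weights
    cross = sum(weights[i][j] * x[i] * x[j]
                for i in range(r) for j in range(r))
    return (r / r_b) * cross / sum(d * d for d in x)

# Hypothetical example: four regions in a line (1-2-3-4),
# with a row-standardised contiguity matrix (each row sums to 1).
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
values = [1.0, 2.0, 3.0, 4.0]

I = morans_i(values, W)
print(round(I, 4))  # 0.4
```

Because the matrix is row-standardised, $$R_{b} = R = 4$$ here and the leading ratio in the formula equals 1.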

For inference purposes, Moran’s I statistic can be expressed as a standardised z-value, computed by:

#### $$Z_{I} = {{I - E(I)}\over{sd(I)}}$$

where $$I$$ is Moran’s I statistic, $$E(I)$$ is the theoretical mean and $$sd(I)$$ the theoretical standard deviation of Moran’s I statistic. Anselin (1992, p. 134) points out the different assumptions that can be made about the data that affect the calculation of the standard deviation. The component allows the standard deviation to be calculated under the assumption of either randomisation or normality. Under both assumptions the computed z-value asymptotically follows a normal distribution, so its significance can be evaluated by means of a standard normal table.
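The sketch below illustrates how the z-value could be computed under both assumptions. It assumes the standard Cliff and Ord moment formulas, with $$E(I) = -1/(R-1)$$ and the weight summaries usually written S0, S1 and S2. The component’s own implementation is not shown in this document and may differ in detail. The names `moran_z` and `line_weights` and the six-region example are hypothetical.

```python
import math

def moran_z(values, weights, randomisation=True):
    """Return (I, E(I), z) for global Moran's I.

    randomisation=True uses the randomisation variance, False the
    normality variance. Requires more than 3 regions.
    """
    n = len(values)
    mean = sum(values) / n
    x = [v - mean for v in values]
    idx = range(n)

    # Summaries of the weight matrix.
    s0 = sum(weights[i][j] for i in idx for j in idx)
    s1 = 0.5 * sum((weights[i][j] + weights[j][i]) ** 2
                   for i in idx for j in idx)
    s2 = sum((sum(weights[i]) + sum(weights[j][i] for j in idx)) ** 2
             for i in idx)

    sum_sq = sum(d * d for d in x)
    cross = sum(weights[i][j] * x[i] * x[j] for i in idx for j in idx)
    i_stat = (n / s0) * cross / sum_sq
    e_i = -1.0 / (n - 1)                      # theoretical mean

    if randomisation:
        k = (sum(d ** 4 for d in x) / n) / (sum_sq / n) ** 2  # kurtosis term
        a = n * ((n * n - 3 * n + 3) * s1 - n * s2 + 3 * s0 * s0)
        b = k * ((n * n - n) * s1 - 2 * n * s2 + 6 * s0 * s0)
        var = (a - b) / ((n - 1) * (n - 2) * (n - 3) * s0 * s0) - e_i ** 2
    else:
        var = ((n * n * s1 - n * s2 + 3 * s0 * s0)
               / ((n * n - 1) * s0 * s0)) - e_i ** 2
    return i_stat, e_i, (i_stat - e_i) / math.sqrt(var)

def line_weights(n):
    """Row-standardised contiguity weights for n regions in a line."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in nbrs:
            w[i][j] = 1.0 / len(nbrs)
    return w

# Hypothetical data: six regions in a line with steadily increasing values.
W = line_weights(6)
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
i_stat, e_i, z_rand = moran_z(values, W, randomisation=True)
_, _, z_norm = moran_z(values, W, randomisation=False)
print(i_stat, e_i, z_rand, z_norm)
```

The two assumptions share the same $$I$$ and $$E(I)$$ and differ only in $$sd(I)$$, so the z-values differ slightly.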

Alternatively, a user may prefer either a saddlepoint approximation (Tiefelsdorf, 2002) or an exact test (Tiefelsdorf, 1998), both of which are also provided. In general these make little difference to the significance of global tests unless $$R$$ is quite small. If the user would prefer a permutation test, this facility is provided in the Moran’s I permutation test component.
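To illustrate the general idea behind a permutation test (not the implementation of the permutation test component, which is not described here), the sketch below shuffles the values across regions many times and counts how often the shuffled Moran’s I is at least as large as the observed one. The function names, the one-sided “greater” pseudo p-value, the number of permutations and the data are all assumptions made for illustration.

```python
import random

def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    x = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * x[i] * x[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in x)

def line_weights(n):
    """Row-standardised contiguity weights for n regions in a line."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in nbrs:
            w[i][j] = 1.0 / len(nbrs)
    return w

def permutation_test(values, weights, n_perm=999, seed=42):
    """Return (observed I, one-sided pseudo p-value for 'greater')."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # reassign values to regions at random
        if morans_i(shuffled, weights) >= observed:
            at_least += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    return observed, (at_least + 1) / (n_perm + 1)

# Hypothetical data: eight regions in a line with a clear spatial trend.
W = line_weights(8)
observed, p = permutation_test([float(v) for v in range(1, 9)], W)
print(observed, p)
```

With a strong trend such as this one, very few random arrangements reach the observed value, so the pseudo p-value is small.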

Moran’s I generally ranges between -1 and 1. An estimate close to 0 implies no spatial autocorrelation. For a significant estimate, the closer it is to 1, the stronger the positive spatial autocorrelation; the closer it is to -1, the stronger the negative spatial autocorrelation.

Anselin (1996) shows the relationship between the Moran Scatterplot and the Moran’s I statistic. The Moran Scatterplot visualises and identifies the degree of local spatial instability in spatial association that is present in the Moran’s I statistic. The Moran’s I statistic can be interpreted as the regression coefficient in a bivariate regression of the spatially lagged variable, $$W_{x}$$, on the original variable, $$x$$ (in deviations from the mean). The spatially lagged variable, $$W_{x}$$, is the average of observations at neighbouring locations, that is, locations for which $$w_{ij} \neq 0$$.

This interpretation of Moran’s I easily translates to a bivariate scatterplot with $$x$$ on the x-axis, $$W_{x}$$ on the y-axis and Moran’s I being the slope of the linear best fit. The scatterplot is useful in identifying those observations that do not conform, and that differ significantly from the global Moran’s I in magnitude and/or direction. The scatterplot centres on the point where the mean of the variable meets the mean of the lagged variable, and the four quadrants of the plot relative to this point give information about the type of association that is present. The upper right and lower left quadrants represent positive spatial association, while the upper left and lower right quadrants show those observations that have negative spatial association. The densities of the quadrants indicate which spatial pattern dominates and also provide information on the distribution of the individual spatial associations and the contribution of each to the global statistic. Further, the plot shows outliers and leverage points and can provide an overall picture of the consistency of the global indicator. The Moran Scatterplot is related to the Local Moran’s I below.
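The relationship described above can be checked numerically. The following sketch, which assumes a row-standardised weight matrix, computes the spatially lagged variable, fits the least-squares slope of $$W_{x}$$ on $$x$$, and assigns each region to a scatterplot quadrant. The quadrant labels (“high-high” and so on), the function names and the five-region data are illustrative assumptions.

```python
def spatial_lag(x, weights):
    """Weighted average of neighbouring values for each region."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

def morans_i(x, weights):
    """Moran's I for values x already expressed as deviations from the mean."""
    n = len(x)
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * x[i] * x[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in x)

# Hypothetical data: five regions in a line, row-standardised weights.
W = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
raw = [1.0, 2.0, 4.0, 5.0, 0.0]
mean = sum(raw) / len(raw)
x = [v - mean for v in raw]            # deviations from the mean
lag = spatial_lag(x, W)                # the W_x values

slope = ols_slope(x, lag)
I = morans_i(x, W)

# Quadrants are taken relative to the means of x and of the lagged variable.
mean_x, mean_lag = sum(x) / len(x), sum(lag) / len(lag)
quadrants = [
    ("high" if xi > mean_x else "low") + "-" +
    ("high" if li > mean_lag else "low")
    for xi, li in zip(x, lag)
]
print(slope, I, quadrants)
```

In this example the slope and Moran’s I agree, and the last two regions fall in the “high-low” and “low-high” quadrants, which are the ones that indicate negative spatial association.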

## Inputs

We will explore Moran’s I by looking at the spend on the Pharmaceutical Benefits Scheme (PBS) across Greater Hobart, to determine whether areas with higher spend are grouped together.

To do this:

Once you have done this, open the Moran’s I tool (Tools → Spatial Statistics → Moran’s I) and enter your parameters as shown in the image below. These are also explained further below.

• Dataset Input: the dataset that contains the variable(s) to be tested. Here we select Spatialised PBS Data – Hobart
• Spatial Weight Matrix: the matrix which tells us how the areas are related to each other in “nearness”. Here we select Contig SWM Hobart SA2s
• Key Column: The unique identifier for your areas. For this example, select SA2 ID
• Variable: the variable(s) to be tested. Select PBS spend per capita
• Alternative Hypothesis: the alternative hypothesis for the test. This can be two.sided, greater (one-sided, greater than) or less (one-sided, less than). Select two.sided
• Inference: a tick box indicating the assumption under which the variance should be calculated. A tick indicates randomisation; a blank box indicates normality. Tick the box
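As a rough illustration of how the three Alternative Hypothesis options listed above turn a z-value into a p-value, the sketch below uses the standard normal distribution. The function name `p_value` is hypothetical, and the tool’s own calculation is assumed rather than confirmed to follow this convention.

```python
import math

def p_value(z, alternative="two.sided"):
    """p-value for a standard normal z-value under the chosen alternative."""
    upper = 0.5 * math.erfc(z / math.sqrt(2.0))    # P(Z >= z)
    lower = 0.5 * math.erfc(-z / math.sqrt(2.0))   # P(Z <= z)
    if alternative == "greater":    # positive spatial autocorrelation
        return upper
    if alternative == "less":       # negative spatial autocorrelation
        return lower
    if alternative == "two.sided":  # autocorrelation in either direction
        return math.erfc(abs(z) / math.sqrt(2.0))
    raise ValueError("alternative must be two.sided, greater or less")

for alt in ("two.sided", "greater", "less"):
    print(alt, p_value(1.96, alt))
```

For a positive z-value the two.sided p-value is twice the greater p-value, which is why a one-sided test can be significant when the two-sided test is not.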

Once you have selected the parameters, click Add & Run to execute the tool.

## Outputs

Once the tool has successfully finished, tick both boxes and click Display. This will open two outputs. The first is a data file that you can map, containing the variable and the lagged variable (the mean value of that variable across all the areas neighbouring that area or row).
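The lagged variable in the first output can be reproduced by hand for a small case. The sketch below assumes a contiguity-based neighbour list keyed by area identifier; the area IDs, neighbour structure and values are invented for illustration and are not taken from the Hobart dataset.

```python
# Hypothetical areas A-D arranged in a line: A-B-C-D.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
spend = {"A": 10.0, "B": 20.0, "C": 30.0, "D": 40.0}

def lagged_values(values, neighbours):
    """Mean of the neighbouring areas' values for each area."""
    lag = {}
    for area, nbrs in neighbours.items():
        if nbrs:
            lag[area] = sum(values[n] for n in nbrs) / len(nbrs)
        else:
            lag[area] = None   # assumption: no neighbours means no lag value
    return lag

lag = lagged_values(spend, neighbours)
for area in sorted(spend):
    print(area, spend[area], lag[area])
```

Taking the plain mean over neighbours is equivalent to multiplying by a row-standardised contiguity matrix.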

The second is a text window with the outputs of your Moran’s I test (shown below) under all of the different conditions. In all of them the Moran’s I value is both small (close to 0) and not significant (p > 0.05), so there is no evidence of spatial autocorrelation (clustering) of PBS spend per capita by SA2 across Greater Hobart.