Local Moran’s I

Introduction

While the Moran scatter plot provides more detail on the type of spatial clustering, it does not report on the significance of such clustering, or on which areas or regions contribute most strongly to the overall ‘global’ clustering statistic. The local Moran’s I statistic was developed to do this. Anselin (1995) defines a class of statistics he calls Local Indicators of Spatial Association (LISAs), of which local Moran’s I is one. The local Moran’s statistic offers insight into the behaviour of data at local levels by decomposing the global Moran’s I statistic into the contribution of each observation to the overall spatial association. LISAs serve two purposes in ESDA: they indicate local spatial clusters, and they support sensitivity analysis by identifying outliers.

Anselin (1995: 94) defines a LISA as any statistic that satisfies two requirements:

a. the LISA for each observation gives an indication of the extent of significant spatial clustering of similar values around each observation; and
b. the sum of LISAs for all observations is proportional to a global indicator of spatial association.

As with global measures, LISAs test whether the observed spatial pattern of a variable of interest amongst areas is extreme, or is what would be expected given a random geographic distribution of the variable.

Local Moran’s I is calculated as:

\(I_{i} = x_{i} \sum\limits_{j} w_{ij}x_{j}\)

where \(x_{i}\) and \(x_{j}\) are the observations expressed as deviations from the mean and \(w_{ij}\) is the element of the spatial weights matrix linking areas \(i\) and \(j\). A positive value of \(I_{i}\) means that area \(i\) is surrounded by similar values (high values near high values, or low near low), while a negative value means that it is surrounded by dissimilar values (a high value among low values, or the reverse).
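The formula can be illustrated with a minimal Python sketch. It is not the code used by the Tools; the four-area data set, the chain-shaped neighbour structure and the function name `local_morans_i` are all hypothetical and chosen only to keep the arithmetic easy to follow. The sketch also checks requirement (b) above: the local values sum to the global Moran’s I up to a constant factor.

```python
import numpy as np

def local_morans_i(values, weights):
    """Return I_i = x_i * sum_j w_ij * x_j, with x in deviations from the mean."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    dev = x - x.mean()
    return dev * (w @ dev)

# Hypothetical toy data: four areas in a row, each neighbouring the next.
values = np.array([10.0, 12.0, 2.0, 4.0])
binary = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
weights = binary / binary.sum(axis=1, keepdims=True)  # row-standardised

local_i = local_morans_i(values, weights)

# Global Moran's I, and the constant that links it to the sum of local values.
dev = values - values.mean()
n = len(values)
s0 = weights.sum()
global_i = (n / s0) * (dev @ weights @ dev) / (dev @ dev)
scale = n / (s0 * (dev @ dev))

print(local_i)                         # one value per area
print(global_i, scale * local_i.sum()) # the two numbers agree
```

The two end areas have positive local values because their single neighbour lies on the same side of the mean, while the two middle areas, which straddle the jump from high to low, have negative values.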

Significance testing of the local Moran statistics can be somewhat problematic. Unlike the \(G_{i}\) and \(G_{i}^{*}\) statistics (see below), the local Moran does not conform to a common distribution and so the test under a normality assumption should be treated with caution. Instead Anselin (1995) suggests a conditional randomisation or permutation approach to give so-called pseudo significance levels. Tiefelsdorf (2000) published the exact reference distribution of Moran’s I, but in a later paper recommends an application of the Saddlepoint approximation, as it “outperforms other approximation methods with respect to its accuracy and computational costs” (Tiefelsdorf, 2002: 187). The Local I component in the Tools provides results for a normal distribution, a Saddlepoint approximation of the standard deviate and the exact standard deviate.
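The conditional permutation idea can be sketched as follows. This is an illustration only and is not one of the three tests offered by the Tools: the value at area i is held fixed, the remaining values are shuffled across the other areas, and the observed statistic is compared with the simulated ones. The function name, the two-sided comparison on absolute values, the default of 999 permutations and the toy data are all assumptions made for this sketch.

```python
import numpy as np

def conditional_permutation_pvalues(values, weights, n_permutations=999, seed=0):
    """Return (observed I_i, pseudo p-values) by conditional permutation."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    dev = x - x.mean()
    n = dev.size
    observed = dev * (w @ dev)
    rng = np.random.default_rng(seed)
    pvalues = np.empty(n)
    for i in range(n):
        others = np.delete(dev, i)     # values at every area except i
        w_others = np.delete(w[i], i)  # weights linking i to those areas
        simulated = np.empty(n_permutations)
        for k in range(n_permutations):
            shuffled = rng.permutation(others)
            simulated[k] = dev[i] * (w_others @ shuffled)
        # Share of simulated values at least as extreme as the observed one
        # (assumed two-sided rule, counting the observed value itself).
        extreme = np.sum(np.abs(simulated) >= np.abs(observed[i]))
        pvalues[i] = (extreme + 1) / (n_permutations + 1)
    return observed, pvalues

# Hypothetical toy data: six areas in a row, high values then low values.
values = np.array([10.0, 11.0, 12.0, 1.0, 2.0, 3.0])
binary = np.eye(6, k=1) + np.eye(6, k=-1)
weights = binary / binary.sum(axis=1, keepdims=True)

observed, pseudo_p = conditional_permutation_pvalues(values, weights)
print(observed)
print(pseudo_p)
```

With so few areas the number of distinct permutations is small, so the pseudo p-values here are coarse; the sketch is meant to show the mechanics rather than to produce usable significance levels.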

Inputs

To compute the Local Moran’s I statistics, we will look at socio-economic data in Adelaide to examine the extent of spatial autocorrelation.

To do this:

  • Select Adelaide GCCSA as your area
  • Select SA1 SEIFA 2011 – The Index of Relative Socio-Economic Disadvantage (IRSD) as your dataset, selecting all variables
  • Spatialise the dataset, naming it something like SPATIALISED SEIFA IRSD Adelaide
  • Generate a Contiguous Spatial Weights Matrix for the spatialised dataset, using 1st order Queen contiguity. Name it something like Contig SWM Adelaide SA1s
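For readers who want to see what a 1st order Queen contiguity matrix contains, the following Python sketch builds one for a hypothetical regular grid of cells and then row-standardises it. Real SA1s are irregular polygons and the portal derives their neighbours from the boundary geometry, so the grid, and the function names `queen_weights_grid` and `row_standardise`, are assumptions made purely for illustration.

```python
import numpy as np

def queen_weights_grid(n_rows, n_cols):
    """Binary 1st order Queen contiguity for a grid: cells that share an
    edge or a corner are neighbours."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                    if (dr, dc) != (0, 0) and inside:
                        w[i, rr * n_cols + cc] = 1.0
    return w

def row_standardise(w):
    """Divide each row by its sum so that the weights in a row add to 1."""
    row_sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

binary = queen_weights_grid(3, 3)
standardised = row_standardise(binary)

print(binary.sum(axis=1))        # number of neighbours of each cell
print(standardised.sum(axis=1))  # every row sums to 1
```

On the 3 x 3 grid the corner cells have three neighbours, the edge cells five and the centre cell eight. After row standardisation, multiplying the matrix by a variable gives the average of that variable over each cell's neighbours.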

Once you have done this, open the Local Moran’s I tool (Tools → Spatial Statistics → Local Moran’s I) and enter the parameters as they appear in the image below. These are also explained underneath the image.

[Click to Enlarge]

  • Dataset Input: the dataset that contains the variable(s) to be tested. Here we use the dataset named SPATIALISED SEIFA IRSD Adelaide
  • Spatial Weights Matrix: the spatial weight matrix to be used (described here). In this instance we use the one named Contig SWM Adelaide SA1s
  • Key Column: specify the unique codes for your areas. In this case, we will use SA1 Code
  • Variable: the variable(s) to be tested. Here we use Score
  • Alternative Hypothesis: can be two.sided, greater or less than. For this we have no a priori assumptions, so we select two.sided
  • Spatial Weights Matrix Style: here you specify the kind of standardisation that you used for your spatial weights matrix. Our matrix is row.standardised
  • Test: indicates the shape of the distribution used to test for significance. As suggested above, we select saddlepoint
  • Significance Level: here you set the level of significance for your test. In this instance we will use the default of 0.05
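The effect of the Alternative Hypothesis and Significance Level parameters can be illustrated with a small Python sketch that converts a standard deviate into a p-value under a normal reference distribution. This is a simplification: the saddlepoint and exact tests arrive at their standard deviates differently, and the function name `p_value_from_deviate` is hypothetical. The option names follow those listed above, with "less" assumed to be the internal name of the "less than" option.

```python
import math

def p_value_from_deviate(z, alternative="two.sided"):
    """p-value for a standard deviate z under a standard normal distribution."""
    upper = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z)
    if alternative == "greater":
        return upper
    if alternative == "less":
        return 1.0 - upper
    if alternative == "two.sided":
        return 2.0 * min(upper, 1.0 - upper)
    raise ValueError(f"unknown alternative: {alternative}")

significance_level = 0.05
for z in (0.5, 1.96, 2.8):
    p = p_value_from_deviate(z, "two.sided")
    print(z, round(p, 4), p < significance_level)
```

A two.sided test flags both strongly positive and strongly negative local statistics, which is why it suits the case where there is no prior expectation about the direction of the association.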

Once you have entered the parameters, click Add and Run to execute the tool.

Outputs

Your output will be a dataset that can be mapped using any of the variables produced by the analysis. These are explained below.

  • li: The local indicator value for your area
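  • Saddlepoint: the saddlepoint approximation of the standard deviate for your area
  • Pr.(Sad): the p-value associated with the saddlepoint standard deviate for your area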
  • score: the variable that you used for the analysis, for your area
  • score_Lagged: the average value of the variable for the areas surrounding each area
  • score_Scaled: the value of the variable for your area scaled (z score)
  • score_Lagged_Scaled: the average value of the variable for the areas surrounding each area, scaled (z score)
  • score_map_group: the number representing the group that your area belongs to: 0 = Not Significant; 1 = High surrounded by High; 2 = High surrounded by Low; 3 = Low surrounded by High; and 4 = Low surrounded by Low
  • score_map_group_name: the names of the above groups
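The way these columns relate to one another can be sketched in Python. This is an assumed reconstruction rather than the tool's actual code: in particular, the lagged values are assumed to be z-scored after averaging, "high" and "low" are assumed to mean above or below zero on the scaled values, and the p-values in the example are invented. The function name `classify_map_groups` is hypothetical.

```python
import numpy as np

def classify_map_groups(values, weights, p_values, significance_level=0.05):
    """Assign group codes 0 to 4 in the manner of score_map_group."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    p = np.asarray(p_values, dtype=float)

    lagged = w @ x                                   # like score_Lagged
    scaled = (x - x.mean()) / x.std(ddof=1)          # like score_Scaled
    lagged_scaled = (lagged - lagged.mean()) / lagged.std(ddof=1)

    high = scaled > 0                  # area itself is above average
    high_neighbours = lagged_scaled > 0  # its surroundings are above average
    groups = np.select(
        [high & high_neighbours, high & ~high_neighbours,
         ~high & high_neighbours, ~high & ~high_neighbours],
        [1, 2, 3, 4],
    )
    groups[p >= significance_level] = 0  # not significant
    return groups

# Hypothetical toy data: four areas in a row, with invented p-values.
values = np.array([10.0, 12.0, 2.0, 4.0])
binary = np.eye(4, k=1) + np.eye(4, k=-1)
weights = binary / binary.sum(axis=1, keepdims=True)
p_values = np.array([0.01, 0.20, 0.30, 0.02])

groups = classify_map_groups(values, weights, p_values)
print(groups)
```

In this example the first area is classified as High surrounded by High and the last as Low surrounded by Low, while the two middle areas fall into group 0 because their invented p-values exceed the significance level.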

You can create a choropleth of these variables. For the image below, we have chosen score_map_group_name. Red indicates areas with high IRSD index scores (low disadvantage) surrounded by other high scores, blue indicates high scores surrounded by low scores, green indicates low scores surrounded by high scores and purple indicates low scores surrounded by low scores. Orange represents areas where the local statistic is not statistically significant.

[Click to Enlarge]