# Spatial Regression

## Introduction

When working with data collected across geographic locations, it may not be appropriate to treat them as conceptually identical to cross-sectional data on individuals or businesses at a single location. Spatially adjacent observations are likely to exhibit spatial interdependence, owing to dynamics which accompany proximity. This reflects Tobler’s (1970) maxim that ‘everything is related to everything else but near things are more related than distant things’. Spatial autocorrelation in the data violates the assumption of independence among observations on which ordinary least squares (OLS) and other standard statistical techniques rely. Ignoring this dependence between neighbouring regions will lead to inefficient and/or biased regression results (Anselin, 1988a).

## Spatial Autocorrelation Diagnostic Tests

A number of diagnostic tests can be performed to assess whether an ordinary least squares (OLS) regression is the most appropriate type of model to use. These tests are carried out on the results of an OLS regression and can determine whether the data being modelled violate the OLS assumption of independence among observations. They also provide insight into which of the spatial econometric models (explained in the next section) is the most appropriate to use.
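As one illustration of a diagnostic computed from OLS residuals, the sketch below evaluates the standard Lagrange Multiplier statistic for spatial error dependence, $$LM = (e'We / \hat{\sigma}^2)^2 / tr(W'W + WW)$$, where $$e$$ is the vector of OLS residuals, $$W$$ is a spatial weights matrix and $$\hat{\sigma}^2 = e'e/n$$. It is a minimal example under stated assumptions, not the AURIN implementation: the function names `lm_error_test` and `line_weights`, the six-region layout and the residual values are all hypothetical.

```python
import math


def line_weights(n):
    """Row-standardised weights for n regions arranged along a line
    (hypothetical layout: each region neighbours the one on either side)."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in neighbours:
            W[i][j] = 1.0 / len(neighbours)
    return W


def lm_error_test(residuals, W):
    """LM statistic for spatial error dependence, from OLS residuals."""
    n = len(residuals)
    # Spatial lag of the residuals: (We)_i = sum_j w_ij * e_j
    We = [sum(W[i][j] * residuals[j] for j in range(n)) for i in range(n)]
    eWe = sum(residuals[i] * We[i] for i in range(n))
    sigma2 = sum(e * e for e in residuals) / n  # ML estimate of the error variance
    # T = tr(W'W + WW)
    T = sum(W[i][j] ** 2 + W[i][j] * W[j][i] for i in range(n) for j in range(n))
    stat = (eWe / sigma2) ** 2 / T
    # Upper tail of a chi-squared distribution with one degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical OLS residuals for six regions (illustrative values only)
residuals = [1.2, 0.8, 0.3, -0.4, -0.9, -1.0]
stat, p_value = lm_error_test(residuals, line_weights(6))
print(f"LM error statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

A small p-value would point towards spatial dependence that the OLS model has left in its residuals.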

Tests implemented in the AURIN portal include:

## Spatial Econometric Models

A suite of spatial econometric models has been developed to overcome the spatial autocorrelation problem and allow econometric modelling to produce unbiased and efficient estimates. Anselin (1988: 34) introduced the model specification which applies to “situations where observations are available for a cross-section of spatial units, at one point in time.” These models are estimated using maximum likelihood.

The issue of model selection techniques (specification strategies) remains contentious in the spatial econometric literature. One viewpoint is that the researcher should not engage in a specification search but rather pre-filter the data, netting out any inherent spatial dependence (for example, Getis, 1995). The spatially-filtered data can then be approached using conventional OLS estimation and the suite of spatial econometric models would not be required.

The alternative approach to ‘filtering’ can be cast in the broader debate common among time series econometricians. Two options appear possible. First, we could proceed with a specific-to-general approach (the so-called ‘classical approach’), which begins with the simplest OLS regression and then uses appropriate Lagrange Multiplier (LM) tests to test a range of ‘added variables’, including the presence of spatial dependence. In this case, the researcher would ultimately choose the model with the highest test value. Second, as an alternative, we might follow the Hendry general-to-specific approach, where the researcher deliberately sets out with an over-parameterised model, which in this context would include all the spatial effects, and then ‘tests down’ using valid simplifying restrictions to the parsimonious form.

Obviously, the user is free to use the components in whatever order they deem fit for their purpose. We would assume most spatial econometricians use the specific-to-general approach whereby they begin with the linear regression model.

Models implemented in the AURIN portal include:

## Other Diagnostic Tests

The diagnostic tests described earlier determine whether there is spatial dependence in the data and help point the user towards the most appropriate spatial econometric model. Those tests use the results from an OLS regression. Other diagnostic tests require one of the spatial econometric models to be estimated and use its results to similarly assess whether there is spatial dependence in the data. These are the Likelihood Ratio test and the Wald test. As these tests require the spatial econometric model to have been estimated, their results are reported with the results for the specific spatial econometric model above, but are more fully explained here.

The Likelihood Ratio (LR) test compares the Log Likelihood of the OLS model to the Log Likelihood of the specific spatial econometric model in the following way:

#### $$LR = -2LL_{null} + 2LL_{alter}$$

where $$LL_{null}$$ is the Log Likelihood of the null model (OLS) and $$LL_{alter}$$ is the Log Likelihood of the alternative model. This statistic is asymptotically chi-squared distributed with the number of degrees of freedom being the number of additional parameters in the alternative model.
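The calculation can be sketched in a few lines of Python using only the standard library. This is an illustrative example rather than the portal's own code: the function names `chi2_sf` and `likelihood_ratio_test` and the two log likelihood values are hypothetical, and the chi-squared tail probability is computed from the closed-form recurrence for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variate with integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Start from the closed form for df = 1 (odd) or df = 2 (even) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # ... then step up two degrees of freedom at a time:
    # Q(a + 1, h) = Q(a, h) + h^a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(ll_null, ll_alter, extra_params=1):
    """LR = -2 * LL_null + 2 * LL_alter, referred to chi-squared(extra_params)."""
    lr = -2.0 * ll_null + 2.0 * ll_alter
    return lr, chi2_sf(lr, extra_params)


# Hypothetical log likelihoods: OLS (null) versus a spatial model (alternative)
lr, p_value = likelihood_ratio_test(ll_null=-152.3, ll_alter=-147.9)
print(f"LR = {lr:.2f}, p-value = {p_value:.4f}")
```

Here `extra_params` is the number of additional parameters in the alternative model, which is one when the spatial model adds a single spatial parameter to the OLS specification.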

The Wald test uses the maximum likelihood estimate of the extra parameter ($$\rho$$ or $$\lambda$$) and tests how far it is from zero, measured in standard errors. Like the LR statistic, the Wald statistic is asymptotically chi-squared distributed, here with one degree of freedom. If the test fails to reject the null hypothesis, this suggests that omitting the spatially lagged dependent variable or the spatial error term will not substantially harm the fit of the model, since a predictor with a coefficient that is very small relative to its standard error is generally not doing much to help predict the dependent variable.
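A minimal sketch of the Wald calculation follows, assuming the statistic is formed as the squared ratio of the estimate to its standard error, which is the usual form for a single parameter. The function name `wald_test` and the estimate and standard error shown are hypothetical values chosen for illustration, not output from any fitted model.

```python
import math


def wald_test(estimate, std_error):
    """Wald statistic for H0: parameter = 0, referred to chi-squared(1)."""
    z = estimate / std_error          # distance from zero in standard errors
    stat = z ** 2                     # asymptotically chi-squared, 1 df
    # Upper tail of chi-squared(1), equal to the two-sided normal tail for |z|
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical spatial parameter estimate and its standard error
wald, wald_p = wald_test(estimate=0.42, std_error=0.15)
print(f"Wald = {wald:.2f}, p-value = {wald_p:.4f}")
```

The same function applies whether the extra parameter is $$\rho$$ or $$\lambda$$, since only the estimate and its standard error enter the statistic.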

A final diagnostic test whose results rely on the estimation of a spatial econometric model is a Lagrange Multiplier test for spatial error autocorrelation in the residuals of that model. It is available for the spatial lag model and the spatial Durbin model and tests whether the corresponding model removes the spatial autocorrelation from the residuals. It is similar to the Lagrange Multiplier tests applied to the OLS residuals, and is asymptotically chi-squared distributed with one degree of freedom.