Gini Coefficient

Introduction

The Gini coefficient is a measure of statistical dispersion intended to represent the income or wealth distribution of a nation’s residents, and it is the most commonly used measure of inequality. It was developed by the Italian statistician and sociologist Corrado Gini and published in his 1912 paper Variability and Mutability.

A Gini coefficient of zero expresses perfect equality, where all values are the same (for example, where everyone has the same income). A Gini coefficient of 1 (or 100%) expresses maximal inequality among values: for a large number of people, if one person has all the income or consumption and all others have none, the Gini coefficient will be very nearly one. A value greater than one can occur if some persons make a negative contribution to the total (for example, by having negative income or wealth). For larger groups, values close to or above 1 are very unlikely in practice.

Similar to the Theil Index, the Gini Coefficient measures how unequally a variable is distributed across space. If all of the individual areas in your study region have the same or similar proportions of a certain variable (say, unemployed people), then there is little or no inequality in the distribution of that variable. If the proportions differ greatly between areas, then the inequality across your study region is large.

The Gini Coefficient compares the Lorenz curve of the observed distribution with the line of perfect equality (the perfect distribution line). It is defined as the ratio of the area between the Lorenz curve and the diagonal of perfect equality to the area of the triangle below this diagonal, as shown in the diagram below. The Lorenz curve plots the cumulative share of some variable, often income, against the cumulative share of the population ordered from the lowest value to the highest.

[Figure: Lorenz curve and the diagonal of perfect equality]
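
The area-ratio definition above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own implementation: the function names `lorenz_points` and `gini_from_lorenz` are hypothetical, and the area under the Lorenz curve is approximated with the trapezoid rule over the observed points.

```python
def lorenz_points(values):
    """Return (cumulative population share, cumulative value share) points."""
    ordered = sorted(values)          # lowest value first
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]             # the Lorenz curve starts at the origin
    running = 0.0
    for i, value in enumerate(ordered, start=1):
        running += value
        points.append((i / n, running / total))
    return points


def gini_from_lorenz(values):
    """Ratio of the area between diagonal and Lorenz curve to the triangle area."""
    points = lorenz_points(values)
    # Trapezoid rule: area under the piecewise-linear Lorenz curve.
    area_under_curve = sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    triangle_area = 0.5               # area below the diagonal of perfect equality
    return (triangle_area - area_under_curve) / triangle_area


print(gini_from_lorenz([1, 1, 1, 1]))   # equal values: the curve follows the diagonal
print(gini_from_lorenz([1, 2, 3, 4]))   # approximately 0.25
```

With equal values the Lorenz curve coincides with the diagonal, so the area between them, and hence the coefficient, is zero.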

The formula for the Gini Coefficient from Sen as cited in Anand (1983):

\(G = {1 \over n} \left( n + 1 - 2 {\sum_{i=1}^{n} (n + 1 - i) y_{i} \over \sum_{i=1}^{n} y_{i}} \right)\)

where the values \(y_{i}\), \(i = 1\) to \(n\) are the income levels indexed in non-decreasing order and \(n\) is the population size.
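
As a rough illustration, the formula can be transcribed directly into Python. The function name `gini` is a hypothetical choice, and the sketch assumes the values sum to a non-zero total. The example calls cover the cases described in the introduction: equal values, one person holding everything, and a negative contribution.

```python
def gini(values):
    """Gini coefficient of a whole population, following the formula above."""
    y = sorted(values)                # index the values in non-decreasing order
    n = len(y)
    total = sum(y)                    # assumed to be non-zero
    # Weighted sum: the i-th smallest value gets weight (n + 1 - i).
    weighted = sum((n + 1 - i) * y_i for i, y_i in enumerate(y, start=1))
    return (n + 1 - 2 * weighted / total) / n


print(gini([5, 5, 5, 5]))     # 0.0   perfect equality
print(gini([0, 0, 0, 10]))    # 0.75  one person has everything; equals (n - 1) / n
print(gini([-5, 0, 0, 10]))   # 2.25  a negative value pushes the result above 1
```

The second case shows why the coefficient is only "very nearly" one when a single person holds everything: the value is (n - 1) / n, which approaches 1 as n grows.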

For a random sample, the denominator of the leading fraction is the sample size less one. Hence, as a measure of the inequality of unemployment across regions, the Gini Coefficient becomes:

\(G(S) = {1 \over r - 1} \left( r + 1 - 2 {\sum_{i=1}^{r} (r + 1 - i) u_{i} \over \sum_{i=1}^{r} u_{i}} \right)\)

where \(u_{i}\), \(i = 1\) to \(r\) are the unemployment rates indexed in non-decreasing order and \(r\) is the number of regions. This is a consistent estimator of the Gini Coefficient, though not an unbiased one.
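
The sample version differs from the population formula only in the leading divisor. A minimal sketch follows, with the hypothetical function name `gini_sample` and made-up unemployment rates used purely for illustration:

```python
def gini_sample(rates):
    """Sample Gini estimator G(S): divides by (r - 1) instead of r."""
    u = sorted(rates)                 # non-decreasing order
    r = len(u)
    if r < 2:
        raise ValueError("at least two regions are needed")
    total = sum(u)                    # assumed to be non-zero
    weighted = sum((r + 1 - i) * u_i for i, u_i in enumerate(u, start=1))
    return (r + 1 - 2 * weighted / total) / (r - 1)


# Illustrative rates (percent) for four regions; not real data.
print(gini_sample([8.0, 10.0, 12.0, 14.0]))   # roughly 0.15
print(gini_sample([0.0, 0.0, 0.0, 12.0]))     # 1.0: one region has all the unemployment
```

Because of the (r - 1) divisor, the case where a single region holds all of the unemployment gives exactly 1 rather than (r - 1) / r.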

For non-negative values such as unemployment rates, the Gini Coefficient ranges from 0 to 1. Where there is perfect equality, the Gini Coefficient is zero, which implies a Lorenz curve that follows the perfect distribution line. A Gini Coefficient of 1 implies perfect inequality, for example where one region has all the unemployment within your study area.

Inputs

To show the Gini coefficient tool in use, we will run it to calculate the coefficient for the distribution of male youth unemployment across NSW.

To do this:

  • Select New South Wales as your area
  • Select SA2 OECD Indicators: Unemployment Rates 2011 as your dataset, with the following variables:
    •  SA2 Name
    •  Unemployed males 15 – 24
    •  Males in labour force 15 – 24
    •  Males 15 – 24 unemployment rate
  • First, create a Choropleth of the Males 15 – 24 unemployment rate to get a feel for the distribution of male youth unemployment across NSW. It should look something like the following map:

[Figure: choropleth of the Males 15 – 24 unemployment rate across NSW]

Once you have done this, open the Gini coefficient tool (Tools → Spatial Statistics → Gini Coefficient) and enter the parameters as shown in the image below. These are also explained underneath the image.

[Figure: Gini Coefficient tool parameters]

  • Dataset input: This is the dataset that contains the values you would like to include in the Gini coefficient calculation. In this instance we select SA2 OECD Indicators: Unemployment Rates 2011.
  • Numerator: This is the column that contains the counts of the variable whose inequality of distribution across the study region you would like to calculate. In this instance we select Unemployed males 15 – 24.
  • Denominator: This is the column that contains the total counts of the population from which the numerator is drawn. In this instance we select Males in labour force 15 – 24.
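
The way the numerator and denominator columns combine can be sketched as follows. This is an assumption about the calculation, not a description of the tool's internals: it supposes that each area's rate is the numerator divided by the denominator, and that the sample formula G(S) is then applied to those rates. The area names, counts, column keys and function names below are all hypothetical.

```python
# Hypothetical rows standing in for SA2 records; the counts are made up.
regions = [
    {"sa2_name": "Area A", "unemployed": 30, "labour_force": 400},
    {"sa2_name": "Area B", "unemployed": 50, "labour_force": 500},
    {"sa2_name": "Area C", "unemployed": 90, "labour_force": 600},
]


def rates_from_counts(rows, numerator, denominator):
    """Assumed step: per-area rate = numerator / denominator (empty areas skipped)."""
    return [row[numerator] / row[denominator] for row in rows if row[denominator] > 0]


def gini_sample(rates):
    """Sample Gini estimator G(S) over the regional rates."""
    u = sorted(rates)
    r = len(u)
    weighted = sum((r + 1 - i) * u_i for i, u_i in enumerate(u, start=1))
    return (r + 1 - 2 * weighted / sum(u)) / (r - 1)


rates = rates_from_counts(regions, "unemployed", "labour_force")
print(rates)               # per-area unemployment rates
print(gini_sample(rates))  # roughly 0.23 for these made-up counts
```

Under this reading, the denominator column matters because areas are compared by their rates rather than by their raw counts, so a large area with many unemployed people does not dominate the result simply because of its size.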

Once you have entered the parameters, click Add and Run to execute the tool.

Outputs

Once you have run the tool, click the Display button that appears in the pop-up dialogue box. This should open a text box like the one shown below, which contains the Gini coefficient value for your variable. In this instance, we have a coefficient of 0.214, which suggests some inequality in the distribution of male youth unemployment across NSW SA2s.

[Figure: Gini coefficient output text box]

References

  1. Anand, S. (1983) Inequality and Poverty in Malaysia: Measurement and Decomposition, A World Bank Research Publication. Oxford University Press, New York.
  2. CofFEE Spatial Statistics Tools Help File