Investigating Multiple Datasets

Introduction

In this tutorial we will bring together multiple datasets for a single area and investigate the relationships between them. We will start by looking at the relationship between measures of socio-economic disadvantage and health outcomes.

Selecting geography and data

We begin by selecting our area of study. In this instance we will be looking at the state of Victoria as a whole. If you are interested in another state, select that instead. You can even map all of Australia at SA2 level.

[Select your area > Australia > States and Territories > Victoria]

Now we need to bring in some data. Open the + Dataset button within the Select your data panel and search under keywords for the following datasets:

  1. SA2 Chronic Disease – Modelled Estimate (PHIDU)
  2. SA2 Summary Measure of Disadvantage (PHIDU) or SA2 SEIFA 2011 – The Index of Relative Socio-Economic Disadvantage (IRSD)

The Chronic Disease dataset has a number of attributes that can be selected when you “shop” for the data. For this exercise we only want the following, so make sure only these are checked in the attribute box:

  • Statistical Area Level 2 Code
  • Statistical Area Level 2 Name
  • Arthritis – Rate per 100
  • Asthma – Rate per 100
  • Chronic Obstructive Pulmonary Disease – Rate per 100
  • Circulatory System Diseases – Rate per 100
  • Diabetes – Rate per 100
  • Females with Mental and Behavioural Problems – Rate per 100
  • High Cholesterol – Rate per 100
  • Hypertension – Rate per 100
  • Males with Mental and Behavioural Problems – Rate per 100
  • Musculoskeletal System Diseases – Rate per 100
  • Persons with Mental and Behavioural Problems – Rate per 100
  • Respiratory System Diseases – Rate per 100

You should end up with your Data panel looking like the following:

[Screenshot: Data panel with the selected datasets]

Exploratory Mapping

Before getting into any analysis of the different datasets, we will start by producing some maps of the different indices that we have loaded into AURIN. From your data, create a choropleth map of the Index of Relative Socioeconomic Disadvantage across Victoria, entering your parameters as shown below:

[Screenshot: choropleth parameters]

[Visualise your data > Maps, Charts and Graphs > Map Visualisations > Choropleth > enter parameters > Add Visualisations > Show/Hide]

Your resultant map should look something like the following:

[Screenshot: choropleth map of SEIFA IRSD across Victoria]

In this instance, lower SEIFA IRSD scores represent more disadvantaged areas (shown in red), while higher SEIFA IRSD scores represent areas with less relative disadvantage (blue).

Now we can add some additional information to this map.

Create a choropleth centroid map showing one of the health indicators that you have loaded. This example uses Type 2 diabetes, but you can enter any one of the health indicators. Enter your parameters as shown below:

[Screenshot: choropleth centroid parameters]

[Visualise your data > Maps, Charts and Graphs > Map Visualisations > Choropleth – Centroid > enter parameters > Add Visualisations > Show/Hide]

When you have loaded your map it should look something like this:

[Screenshot: choropleth centroid map]

At this scale it is difficult to detect any relationship between relative socioeconomic disadvantage and health outcomes, because the statewide view hides local patterns. However, if you zoom in on metropolitan Melbourne, a pattern emerges: areas ranked as more socioeconomically disadvantaged (i.e. lower IRSD scores, shown in red) seem to have higher rates of certain chronic illnesses.

Statistical Analyses

We can use some of the other statistical tools in AURIN to investigate these relationships more formally. To start with, we’ll produce a simple scatter plot of the data that we’ve just mapped. To do this, we first need to join our datasets: add the Merge Aggregated Datasets tool, enter the parameters as shown below, and then run it.

[Screenshot: Merge Aggregated Datasets parameters]

[Tools > Data Manipulation > Merge Aggregated Datasets > enter parameters > Update & Run]

The output of the join should appear in your Data panel as Output inner-join XXX. It’s a good idea to rename this to something more user-friendly for your future analyses.
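
If you want to see what an inner join does outside the portal, the following is a minimal Python sketch of the idea, not AURIN's implementation. The rows, area codes, column names and values are all made up for illustration; the only thing taken from this tutorial is that both datasets share an SA2 code, which is assumed here to be the join key.

```python
# Hypothetical sample rows: codes, names and values are invented for illustration.
irsd_rows = [
    {"sa2_code": "X001", "sa2_name": "Area A", "irsd_score": 1012},
    {"sa2_code": "X002", "sa2_name": "Area B", "irsd_score": 945},
    {"sa2_code": "X003", "sa2_name": "Area C", "irsd_score": 1080},
]
diabetes_rows = [
    {"sa2_code": "X001", "diabetes_rate": 4.1},
    {"sa2_code": "X002", "diabetes_rate": 5.6},
    {"sa2_code": "X004", "diabetes_rate": 3.9},
]


def inner_join(left, right, key):
    """Return combined rows for key values that appear in both tables."""
    # Index the right-hand table by key so each lookup is a dictionary access.
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the columns of both rows into one record.
            joined.append({**row, **match})
    return joined


merged = inner_join(irsd_rows, diabetes_rows, "sa2_code")
for row in merged:
    print(row["sa2_code"], row["irsd_score"], row["diabetes_rate"])
```

Areas that appear in only one table (X003 and X004 in this made-up sample) are dropped, which is the defining behaviour of an inner join. If your joined output has fewer rows than you expect, mismatched area codes are a likely cause.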

Now that you’ve joined the data, you can create a scatter plot to check the relationship between the variables that you mapped (in this instance, socioeconomic disadvantage and rates of Type 2 diabetes). To do this, follow the prompts and create your own:

[Visualise > Maps, Charts and Graphs > Charts – Interactive > Scatter Plot > enter parameters > Add Visualisations > Show/Hide]

Brushing over points on the scatter plot with your mouse or trackpad will bring up the values for those points and highlight them on the map. Can you see a clear correlation?
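
One way to put a number on what the scatter plot shows is Pearson's correlation coefficient. The sketch below computes it with the Python standard library only. The scores and rates are made-up illustrative values, not figures from the PHIDU or SEIFA datasets, and the function name is our own.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for five areas (illustration only).
irsd_scores = [920, 960, 1000, 1040, 1080]
diabetes_rates = [6.1, 5.2, 5.4, 4.3, 4.0]

r = pearson_r(irsd_scores, diabetes_rates)
print(f"Pearson r = {r:.3f}")
```

Because lower IRSD scores indicate greater disadvantage, a negative coefficient would be consistent with the pattern seen on the map: more disadvantaged areas tending to have higher rates. A correlation on its own does not show that one causes the other.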