Tabular Inner Join

Introduction

In order to be able to use a lot of the statistical tools comparing data from different sources, we need a way of “bringing them together”. Many of the tools require the variables we may be interested in modelling or comparing or computing need to be in the same dataset in order to be recognised.

The tabular inner join does this by taking two tables and creating a join based on a specific key or variable. In the vast majority of instances, this key or variable will be the specific area codes – the variable which identifies which SLA or SA1 or LGA is which. It is therefore vital to remember that you can only join datasets that are at the same geographical level of aggregation

In setting up a join we simply specify which table will be on the right side of the new table, which table will be on the left, and join them according to the area code. The tabular inner join will only keep rows for which there are corresponding rows each of the joining tables.

Inputs

To show the Tabular Inner Join tool in use we will combine datasets from the Northern Territory

  • Select Northern Territory as your area
  • Select the following datasets, with all attributes:
    •  Demographic Variables by SLA for Australia
    •  Socio-economic variables by SLA for Australia
  • Open the Tabular Inner Join Tool (Analyse → Tools → Data Manipulation → Tabular Inner Join) and enter the following parameters:
    •  Left Side: Demographic Variables by SLA for Australia
    •  Left Attribute: sla code
    •  Right Side: Socio-economic variables by SLA for Australia
    •  Right Attribute: sla code
  • Once you’ve entered your parameters click Add and Run to execute the tool

It is important to remember that your specific identifier columns may have different names. These naming conventions are the outputs of the data custodians – it is worthwhile to open each of the datasets prior to joining them and identify the column which contains the keys so that you know which ones to include in the inner join.

[Click to Enlarge]

Outputs

Once you have run your tabular inner join, click the Display button on the pop-up dialogue box that appears to view your joined data (it should look something like below). It will appear in your data panel named Output: inner-join XXX. It is a very good idea to rename this to something more meaningful, as more often than not it will be this dataset which will be used in most of your subsequent work. You might also wish to join additional datasets together, and you need to do them sequentially, so renaming them allows you to remember which datasets are contained within your joins.

[Click to Enlarge]

[Click to Enlarge]