# Generalised Linear Model

## Introduction

In statistics, the generalised linear model (GLM) is a flexible generalisation of ordinary linear regression that allows the response variable to have an error distribution other than the normal distribution. The GLM generalises linear regression by relating the linear model to the response variable through a link function, and by allowing the variance of each measurement to be a function of its predicted value.
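
As a sketch of the standard formulation (the notation is an assumption introduced here and does not come from the tool): for a response $Y$ with mean $\mu$, independent variables $X_{1}, \dots, X_{p}$, coefficients $\beta_{0}, \dots, \beta_{p}$, link function $g$, variance function $V$ and dispersion parameter $\phi$, the two ideas above can be written as

$$g(\mu) = \beta_{0} + \beta_{1} X_{1} + \dots + \beta_{p} X_{p}, \qquad \operatorname{Var}(Y) = \phi \, V(\mu)$$

Ordinary linear regression is the special case with the identity link $g(\mu) = \mu$ and constant variance $V(\mu) = 1$. Poisson regression, as another example, typically uses the log link $g(\mu) = \log \mu$ with $V(\mu) = \mu$.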

## Inputs

To run the GLM, open the tool (Tools → Statistical Analysis → Generalised Linear Model) and enter your parameters, which are explained below.


• Dataset input: Here you select the dataset that you would like to include in the GLM.
• Formula: This parameter can be tricky for users who are not familiar with R language syntax, because the formula must be entered in R's format. The basic model is “Y regressed on X”, which is denoted in R as:

#### $$Y \sim X$$

where Y is the name of the dependent variable, X is the name of the independent variable (the variable whose effect on the dependent variable you are testing), and the tilde symbol ( ~ ) means “regressed on”. In this format, you can add multiple independent variables by writing the model as:

#### $$Y \sim X_{1} + X_{2} + X_{3}$$

Additional components of the model, such as interaction terms, and how to enter them are described at the link above.
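
To make the structure of such a formula concrete, here is a minimal Python sketch that splits a simple additive formula into its dependent and independent variables. It is an illustration only: `split_formula` is a hypothetical helper, not part of the tool or of R, the variable names are made up, and it handles only terms joined by `+` (no interaction terms or other R formula operators).

```python
def split_formula(formula: str) -> tuple[str, list[str]]:
    """Split a simple 'Y ~ X1 + X2' formula into (response, predictors).

    Illustrative only: handles additive terms joined by '+',
    not interactions or any other R formula operators.
    """
    if formula.count("~") != 1:
        raise ValueError("formula must contain exactly one '~'")
    left, right = formula.split("~")
    response = left.strip()
    predictors = [term.strip() for term in right.split("+")]
    if not response or any(not term for term in predictors):
        raise ValueError("formula has an empty variable name")
    return response, predictors


# Hypothetical machine-readable variable names, for illustration only.
response, predictors = split_formula("median_income ~ pop_density + pct_employed")
print(response)    # median_income
print(predictors)  # ['pop_density', 'pct_employed']
```

Everything to the left of the tilde is read as the dependent variable, and each `+`-separated name to the right is read as one independent variable.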

It is important that the names of variables, rather than their titles, are entered into the formula. Names are the “machine readable” identifiers of variables, whereas titles are the “human readable” ones. You can find the names in the metadata of your dataset, then copy and paste them into the formula box.


• Family Type: