Regression

Introduction

Regression analysis estimates how the dependent variable changes when one of the independent variables is varied while the other independent variables are held fixed.

Inputs

Regression Name: Specify a name for your regression. If you intend to run more than one regression, a unique name that reflects the variables being processed is advised.

Regression Dataset Input: The dataset containing the variables to be tested.

Regression Dependent Variable: The response (dependent) variable to be used in the regression.

Regression Independent Variable(s): The explanatory (independent) variable(s) to be used in the regression.

Outputs

Output includes regression coefficients and correlation coefficients.

Regression Coefficients:

• Intercept
• Estimate(s)
• Std. Error
• t value (the estimated value divided by its estimated standard error)
• Pr(>|t|) (the two-sided p-value for testing the hypothesis that the coefficient is zero)
• sigma (the residual standard error, i.e. the estimated standard deviation of the residuals)
• r.squared : R², the coefficient of determination (the proportion of variation in the dependent variable explained by variation in the independent variables)
• fstatistic : F (the ratio of the mean square for regression to the mean square for error)
• fstatistic : DFR (degrees of freedom for regression)
• fstatistic : DFE (degrees of freedom for error)
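
As an illustration of how the quantities listed above relate to one another, the following is a minimal sketch for the single-predictor case, written in plain Python. The function name `simple_linear_regression`, the dictionary keys and the sample data are assumptions made for this example and are not part of the tool. Pr(>|t|) is left out because it requires the t-distribution, which the Python standard library does not provide.

```python
import math

def simple_linear_regression(x, y):
    """Ordinary least squares with one independent variable (illustrative only)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Estimates: slope and intercept
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Error, total and regression sums of squares
    fitted = [intercept + slope * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    ssr = sst - sse

    dfr = 1        # degrees of freedom for regression (one predictor)
    dfe = n - 2    # degrees of freedom for error

    sigma = math.sqrt(sse / dfe)  # residual standard error
    se_slope = sigma / math.sqrt(sxx)
    se_intercept = sigma * math.sqrt(1.0 / n + mean_x ** 2 / sxx)

    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "se_slope": se_slope,
        "t_intercept": intercept / se_intercept,  # estimate / std. error
        "t_slope": slope / se_slope,
        "sigma": sigma,
        "r_squared": 1.0 - sse / sst,
        "f": (ssr / dfr) / (sse / dfe),
        "dfr": dfr,
        "dfe": dfe,
    }

# Hypothetical sample data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
result = simple_linear_regression(x, y)
for name, value in result.items():
    print(name, value)
```

With a single independent variable, F equals the square of the slope's t value, which is a quick consistency check on the output.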

Correlation Coefficients: a measurement of how two variables are related.
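
The text does not state which correlation coefficient is reported. Assuming it is the Pearson product-moment coefficient, a minimal sketch of the calculation looks like this (the function name `pearson_r` and the data are hypothetical):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: one perfectly increasing and one perfectly decreasing relationship
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]
y_down = [10, 8, 6, 4, 2]

r_up = pearson_r(x, y_up)      # close to +1
r_down = pearson_r(x, y_down)  # close to -1
print(r_up, r_down)
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.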

The following formula is used in the implementation of linear regression.

$$ŷ = a + \sum_{i} b_i x_i$$

where:
$$y$$ is the dependent variable
$$x_i$$ are the independent variable(s)
$$ŷ$$ is the vector of fitted values
$$a$$ is the y intercept
$$b_i$$ are the estimate(s) of the slope

The residual vector is $$y-ŷ$$.
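
To make the fitted values and residuals concrete, here is a small sketch that reads the formula as the intercept plus the sum of each slope estimate multiplied by its independent variable. The coefficient values, the data and the function name `fitted_values` are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def fitted_values(intercept, slopes, rows):
    """Return a + sum_i(b_i * x_i) for each observation (row of x values)."""
    return [intercept + sum(b * x for b, x in zip(slopes, row)) for row in rows]

# Hypothetical estimates: intercept a and two slopes b_1, b_2
a = 1.0
b = [2.0, -0.5]

# Hypothetical observations: each row holds (x_1, x_2); y holds the observed responses
X = [[1.0, 2.0], [2.0, 0.0], [3.0, 4.0]]
y = [2.5, 5.0, 4.0]

y_hat = fitted_values(a, b, X)                          # fitted values
residuals = [yi - fi for yi, fi in zip(y, y_hat)]       # y - ŷ
rss = sum(r ** 2 for r in residuals)                    # residual sum of squares

print(y_hat)
print(residuals)
print(rss)
```

Least squares chooses the intercept and slopes that make the residual sum of squares as small as possible; the values above are fixed by hand and are not themselves a least-squares fit.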

References

1. Buechler, S. (2007) Statistical Models in R: Some Examples, Department of Mathematics, University of Notre Dame.
2. Ferrari, D. and T. Head (2010) Regression in R Part I: Simple Linear Regression, Statistical Consulting Center, UCLA Department of Statistics.
3. http://en.wikipedia.org/wiki/Residual_sum_of_squares.