Introduction to Simple Linear Regression
Simple Regression Analysis: An Introduction to Linear Regression
This is the first write-up on regression analysis, and it covers an introduction to simple regression analysis. Simple regression analysis is a bivariate analysis, i.e. it involves two variables: a statistical technique to fit a relationship between one independent variable and one dependent variable, wherein the relationship between the two can be approximated by a straight line. More precisely, the mean response at any value of the regressor variable lies on a straight line. This read gives an overview of the simple linear regression model in a nutshell, with an explanation of the variables and terminology within the equation. The note also covers least-squares estimation of the coefficients of the regression model.
Least Square Method
Regression analysis is often used to estimate causality (cause and effect) between two variables, but it should be noted that a high correlation between two variables does not imply causality. For example, shrimps sold in San Francisco might be highly correlated with the number of soaps sold in Ghana. This does not imply that the sale of shrimps is causing the number of soaps sold in Ghana. It could be that another factor is driving both the sale of shrimps and the number of soaps sold, or, in this hypothetical case used purely for illustration, the correlation is simply coincidental. Regression does not imply cause and effect unless the relationship has a theoretical justification outside the sample data analyzed.
Above is a graph of bivariate data, with the independent variable on the horizontal axis and the dependent variable on the vertical axis. The figure shows the simple linear regression model with its model parameters and an error component. X is the independent variable, also called the regressor or predictor variable. The straight line is usually an approximation of the functional relationship between the two variables, and a simple linear equation has only one regressor.
Simple Linear Regression Model | The middle Road
While the X variable is fixed, the dependent or response variable is determined by the random error component. Since the error is normally distributed (with mean zero and variance σ²), the response variable Y at any given X is a normally distributed random variable. This means that the mean response at any value of X lies on the straight line β₀ + β₁X.
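A minimal simulation can make this concrete. The sketch below (not from the article; the parameter values β₀ = 2, β₁ = 3, σ = 1 are arbitrary choices for illustration) draws many errors at a fixed X and shows that the average response settles near the line β₀ + β₁X:

```python
import numpy as np

# Illustrative sketch: simulate Y = b0 + b1*X + e with e ~ N(0, sigma^2).
# Parameter values are made up for this example.
rng = np.random.default_rng(42)

b0, b1, sigma = 2.0, 3.0, 1.0
x = 5.0                                    # a fixed value of the regressor X
e = rng.normal(0.0, sigma, size=100_000)   # random error component
y = b0 + b1 * x + e                        # many draws of the response at this X

# The mean response at X = 5 is close to the point on the line b0 + b1*5 = 17.
print(round(y.mean(), 2))                  # approximately 17.0
```

With more draws, the sample mean of Y at a fixed X converges to the point on the straight line, which is exactly the "mean response" the model describes.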
A detailed video and an in-depth module will follow this read.
Understanding Least Square Estimation of the Coefficients of the Regression Model
This section explains how the coefficients of regression are calculated using the least-squares method. In simple regression analysis only one regressor is involved; when there are multiple regressors, we call it multiple regression analysis. Using the least-squares estimation method, the regression coefficients are estimated by minimizing the sum of squared differences between the observed values and the straight line. The derivation below is for a simple linear equation with multiple observations. The sample regression model uses the letter i as a subscript in the summation, for integers i = 1, 2, …, n depending on the number of observations; the introduction of the subscript i is what distinguishes the sample regression model from the population regression model.
Simple Linear Equation Diagram | The middle Road
In the sample regression model on the left, i denotes the observation number; i = 1, 2, 3, …, n.
The sum of squared differences between the observed values and the straight line is then minimized: the equation on the right is differentiated with respect to each of the two coefficients, and the first-order derivatives are equated to zero.
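Written out in standard least-squares notation, the quantity being minimized and the two first-order conditions are:

```latex
S(\beta_0, \beta_1) = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2

\frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right) = 0

\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right) x_i = 0
```

These two equations are the normal equations of simple linear regression; solving them simultaneously yields the fitted coefficients.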
The differentiation is done taking one coefficient at a time. The example below follows the book Introduction to Linear Regression Analysis by Douglas Montgomery, Elizabeth Peck, and G. Geoffrey Vining; the complete derivation is worked out by The middle Road and written below. The predicted value is also called the fitted value.
Using equation 1 above, we can find the fitted values of both regression coefficients. The fitted coefficients are denoted with a hat (cap) in the equation. The values of Y (the dependent or response variable) and X (the independent/predictor variable, or regressor) used here are the sample averages of the observations in the sample regression model.
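The closed-form estimates that result from this derivation can be computed directly. The sketch below uses made-up data purely for illustration; it applies the standard formulas β̂₁ = Sxy/Sxx and β̂₀ = ȳ − β̂₁x̄:

```python
import numpy as np

# Hedged sketch: least-squares estimates from the closed-form formulas.
# The data points are invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # regressor (independent variable)
y = np.array([2.1, 4.1, 5.9, 8.2, 9.9])   # response (dependent variable)

x_bar, y_bar = x.mean(), y.mean()

# Slope: sum of cross-products about the means over sum of squares about the mean.
b1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
# Intercept: the fitted line passes through the point (x_bar, y_bar).
b0_hat = y_bar - b1_hat * x_bar

print(b0_hat, b1_hat)   # b0_hat ≈ 0.13, b1_hat ≈ 1.97
```

Note that the intercept formula follows directly from the first normal equation: averaging the residuals and setting their sum to zero forces the fitted line through the point of means (x̄, ȳ).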
Refer to the equations alongside. Now, solving for the other regression coefficient, differentiate and equate the equation to zero.
To keep the derivation simple, the summation sign is not carried through for the β₁ terms; the summations are reinstated at the end of the equation.
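For reference, reinstating the summations and solving gives the standard result for the slope (equivalent to the ratio of the corrected sum of cross-products Sxy to the corrected sum of squares Sxx), together with the intercept:

```latex
\hat{\beta}_1
= \frac{\displaystyle \sum_{i=1}^{n} x_i y_i - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right)}
       {\displaystyle \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2}
= \frac{S_{xy}}{S_{xx}},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```

These are the same estimates obtained by solving the two normal equations simultaneously.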