ANOVA (Analysis of Variance) is a framework that forms the basis for tests of significance and provides insight into the levels of variability within a regression model. It is closely related to Linear Regression, but one major difference is that regression is used to predict a continuous outcome on the basis of one or more continuous predictor variables, whereas ANOVA is used to predict a continuous outcome on the basis of one or more categorical predictor variables.
When implementing Linear Regression we often come across terms such as SST (Sum of Squared Total), SSR (Sum of Squared Regression), and SSE (Sum of Squared Error), and wonder what they actually mean. In this post, we will cover these concepts and also work through an example to build a firm understanding of the subject.
SST (Sum of Squared Total)
The Sum of Squared Total is the sum of the squared differences between each observed value of the dependent variable and its average value (mean). One important note here is that we always compare our linear regression best-fit line against the mean of the dependent variable (denoted ȳ), which serves as a horizontal baseline.
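In symbols, if $y_i$ are the observed values of the dependent variable and $\bar{y}$ is their mean over $n$ observations:

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$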
SSR (Sum of Squared Regression)
The Sum of Squared Regression is the sum of the squared differences between each predicted value and the mean of the dependent variable.
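In symbols, with $\hat{y}_i$ denoting the value predicted by the regression line for the $i$-th observation:

$$SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$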
SSE (Sum of Squared Error)
The Sum of Squared Error is the sum of the squared differences between each observed value and its predicted value.
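In symbols:

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

These three quantities are tied together by the identity $SST = SSR + SSE$ (which holds for ordinary least squares with an intercept): the total variability around the mean splits into the part explained by the regression and the residual part the model leaves unexplained.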
To understand how these sums of squares are used, let us go through an example of simple linear regression manually. Suppose John is a waiter at Hotel California and he has the total bill of an…
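While the worked example above is cut off, here is a minimal sketch of the same computation in Python. Note that the bill amounts, tip amounts, and the use of NumPy's polyfit are illustrative assumptions on my part, not the numbers or method from the original example:

```python
import numpy as np

# Hypothetical data (stand-ins for the truncated example):
# total bill (x) and tip amount (y) for a handful of tables.
x = np.array([34.0, 108.0, 64.0, 88.0, 99.0, 51.0])
y = np.array([5.0, 17.0, 11.0, 8.0, 14.0, 5.0])

# Fit a simple linear regression y = b0 + b1 * x by ordinary least squares.
# np.polyfit returns coefficients from highest degree down: [slope, intercept].
b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x          # predicted values
y_bar = y.mean()             # mean of the dependent variable

# The three sums of squares.
sst = np.sum((y - y_bar) ** 2)      # total variability around the mean
ssr = np.sum((y_hat - y_bar) ** 2)  # variability explained by the regression
sse = np.sum((y - y_hat) ** 2)      # unexplained (residual) variability

print(f"SST = {sst:.2f}")
print(f"SSR = {ssr:.2f}")
print(f"SSE = {sse:.2f}")
print(f"SSR + SSE = {ssr + sse:.2f}  (equals SST for OLS with an intercept)")
print(f"R^2 = {ssr / sst:.3f}")
```

Running the script confirms that SSR + SSE reproduces SST, and the ratio SSR/SST gives the familiar R² statistic, the proportion of total variability the regression explains.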