anova
Without a stats backgorund, I'm having a hard time parsing what people are referring to when they say "ANOVA"? Is it a particular technique with associated algorithm? Does it refer to a broad family of methods?
Here's what I think so far:
- people usually mean F-test
- ANOVA is usually used for categorical variables, and the result of performing an ANOVA is a table. In the table you have estimated means for each category and you compare with the null that categories have the same mean.
- if you have continuous variables, the ANOVA you do is a linear regression. You can think of each point on the continuum of the independent variable as a "category", and you are estimating a conditional mean for each category. The null that you are comparing to is the hypothesis that all these conditional means are the same.
1. some intuitions
- if you really thought that there was a strong difference between two groups, then the variance within each group should be small, and the variance between the groups should be big. But if the two groups are really the same population, then those variances should be the same.
2. helpful links
- difference between anova and regression – "The widespread use of ANOVA is partly a relic of statistical analysis before the use of more powerful statistical software"
- When should someone use 2 one-way anovas over 1 two-way anova? – "An ANOVA is just a special case of a linear regression"