Wednesday, November 2, 2011

Singapore whiteboard: Difference between correlation and regression

Great little tutorial question. What is the difference between correlation and regression?

Here's my go at it:

First the similarities. Correlation and regression both measure association: how far two variables move together across respondents. For example, people who register high satisfaction with a restaurant might also register a high likelihood to recommend it. The tightness of this linear relationship is measured and reported by a Pearson correlation coefficient (r).
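To make that concrete, here's a rough sketch in Python of how r is worked out from the textbook formula. The survey scores are made up purely for illustration, and the helper name pearson_r is my own, not something from a library.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y, scaled by each one's spread."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores out of 10, one pair per respondent (invented data)
satisfaction = [3, 5, 6, 7, 8, 9, 10]
recommend = [2, 4, 7, 6, 9, 8, 10]

r = pearson_r(satisfaction, recommend)
print(round(r, 3))
```

The closer r gets to +1 or -1, the tighter the points sit around a straight line; an r near 0 means little linear association.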

Each respondent's pair of scores can be plotted as a data point on a pair of axes. See below:


But the correlation coefficient says nothing about the nature of the relationship between the two variables, other than its direction and how close the data points are to a line of best fit.

A bivariate linear regression offers essentially the same piece of information (expressed as an r-sq) but also tells us a little more. In the diagram above, the r for both sets of data is identical, as would be the r-sq from the respective bivariate regressions.

The line of best fit drawn by the bivariate regression also gives us the slope of the line and the intercept. That is how the two datasets above would differ. If I were answering an exam question I'd go on to say what the slopes and intercepts would be for each dataset.
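Here's a small Python sketch of that point: two datasets with the same r (and r-sq) but different slopes and intercepts. The numbers are invented and are not the ones from the whiteboard; the second dataset is just the first one's y values rescaled and shifted, which leaves r unchanged. The helper fit_line is my own name for an ordinary least-squares fit.

```python
from math import sqrt

def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; also returns r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Invented data: dataset B is dataset A's y values shrunk and shifted up
x = [1, 2, 3, 4, 5, 6, 7, 8]
y_a = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
y_b = [0.25 * y + 5 for y in y_a]

slope_a, intercept_a, r_a = fit_line(x, y_a)
slope_b, intercept_b, r_b = fit_line(x, y_b)

print("A: slope", round(slope_a, 3), "intercept", round(intercept_a, 3), "r-sq", round(r_a ** 2, 4))
print("B: slope", round(slope_b, 3), "intercept", round(intercept_b, 3), "r-sq", round(r_b ** 2, 4))
```

Both lines report the same r-sq, so correlation alone can't tell the datasets apart. The regression can: B's line is a quarter as steep as A's and sits on a different intercept.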

btw, tell me if what I've written is complete garbage. I think I'm right with it, but I'm often surprised.


