The Student Room Group

Further stats questions on correlation and Spearman's

Screenshot 2024-06-06 003832.png

I'm just wondering what this is actually asking me to know. It says interpret scatter diagrams, so what does this actually mean? Does it just want me to know how scatter diagrams would look based on r values?

It also asks about knowing the difference between association and correlation, but how am I meant to know? I'm pretty sure they're both the same thing.
Screenshot 2024-06-06 004232.png
Screenshot 2024-06-06 004238.png

I don't even know what this question is even asking for so can someone please explain.
Reply 1
Original post by 43Explosion

Can't say for certain exactly how they're defined for your board, but association is a more general concept than linear correlation. So an association could indicate a nonlinear trend, and the R^2 value could be zero because it's based on linear correlation. Some examples:
https://www.statology.org/correlation-vs-association/
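
To make the point concrete, here is a minimal sketch with made-up data (the numbers and the helper name `pearson_r` are just for illustration, not from any exam board). It computes r for a curved relationship, where y is completely determined by x, and for an exact straight line.

```python
import math

def pearson_r(xs, ys):
    """Product moment correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up x values, symmetric about zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys_curve = [x ** 2 for x in xs]    # y depends on x, but not linearly
ys_line = [2 * x + 1 for x in xs]  # exact linear relationship

r_curve = pearson_r(xs, ys_curve)
r_line = pearson_r(xs, ys_line)
print("curve:", r_curve)
print("line: ", r_line)
```

For the curve, r comes out as zero even though x and y are clearly associated, while the straight line gives r = 1. That is the sense in which association is the broader idea.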

I'm presuming there is a ^2 missing on the right for the second question? If so, the sum of squared residuals measures how much a line ax+b differs from the data points (xi, yi). Squaring makes the measure for each data point non-negative, and you then sum over the data. The aim with regression is to estimate the parameters a, b that minimise this quantity, and so best predict the data measurements yi given xi. You can think of it as being like a variance of the residuals.
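
Written out, and assuming the missing ^2 really is there, the quantity being described would look something like this (the name S and the number of data points n are my own labels, not necessarily what the question uses):

```latex
S(a, b) = \sum_{i=1}^{n} \bigl( y_i - (a x_i + b) \bigr)^2
```

Each bracket is one residual, so S(a, b) can never be negative, and it is only zero when every data point lies exactly on the line.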
Reply 2
Original post by mqb2766

Ok, I kinda get the difference between association and correlation now. Correlation is for linear relationships and association is for non-linear relationships, to sum it up. That link helped a lot.

My bad for cropping it badly. There is a ^2 missing from the equation.

If I understand correctly, the equation represents the distance from all the points to the line of best fit, and the aim is to get the equation to give the smallest value possible?
Reply 3
Original post by 43Explosion

Pretty much for both points, though strictly association covers any relationship, linear or not, and correlation is the linear case. For the second one, yi-(axi+b) is the residual, or error, between the measurement and the prediction. You want this to be as small as possible (noting that it is typically nonzero as there is noise in the measurements). Squaring it to make it positive and summing over all data points gives you a measure of how well a particular line fits the data. The regression line corresponds to the parameter values a, b which minimise this quantity.
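
If it helps to see this numerically, here is a rough sketch using made-up data. The function names are hypothetical, and the formulas a = Sxy/Sxx and b = ybar - a*xbar are the standard least squares results for a line y = ax + b. It fits the line and then checks that nudging a or b in any direction makes the sum of squared residuals bigger.

```python
def sum_squared_residuals(a, b, xs, ys):
    """Sum over all points of (yi - (a*xi + b))^2."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

def least_squares_line(xs, ys):
    """Gradient a and intercept b of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Made-up measurements that roughly follow a straight line, with some noise.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)
best = sum_squared_residuals(a, b, xs, ys)
print("a =", a, "b =", b, "sum of squared residuals =", best)

# Moving a or b away from the fitted values makes the fit worse.
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
worse = [sum_squared_residuals(a + da, b + db, xs, ys) for da, db in nudges]
print(all(w > best for w in worse))
```

The residuals are not all zero here because the data is noisy, but no other choice of a and b gives a smaller total.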
Reply 4
Original post by mqb2766

I fully got it now. I appreciate the help 👍️
