MathMeister
#1
Thread starter, 7 years ago
Hello
Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations.
Why are Sxx and Syy different from the form of standard deviation?
I assume Syy is the same as Sxx, except that you treat the y-axis as if it were the x-axis and compute the SD as normal...
Please help...
ztibor
#2
7 years ago
(Original post by MathMeister)
Hello
Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations.
Why are Sxx and Syy different from the form of standard deviation?
I assume Syy is the same as Sxx, except that you treat the y-axis as if it were the x-axis and compute the SD as normal...
Please help...
When the deviation is calculated from a sample, using the sample mean, the results are known as the sample variance and sample standard deviation (Sxx, Syy), as opposed to the population variance and standard deviation, which are calculated from the probability distribution of the random variable.
So Sxx or Syy is only an estimated standard deviation; that is, they are estimators.
The variance estimate is unbiased when we divide by (n-1) rather than by n, as is done for the population variance.

\displaystyle S_x=\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left (x_i-\bar x\right )^2}

\displaystyle \sigma_x=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left (x_i-\bar x\right )^2}
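
The two formulas differ only in the divisor. As a rough illustration, here is a minimal Python sketch that computes both for a small data set and checks them against the standard library. The function names and the data are invented for this example and are not from the thread.

```python
import math
import statistics


def sd_divide_by_n(values):
    """Standard deviation with divisor n (sigma_x above)."""
    mean = sum(values) / len(values)
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / len(values))


def sd_divide_by_n_minus_1(values):
    """Standard deviation with divisor n - 1 (S_x above)."""
    mean = sum(values) / len(values)
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (len(values) - 1))


# Illustrative data only (hypothetical values).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sigma_x = sd_divide_by_n(data)          # matches statistics.pstdev(data)
s_x = sd_divide_by_n_minus_1(data)      # matches statistics.stdev(data)

print(sigma_x, statistics.pstdev(data))
print(s_x, statistics.stdev(data))
```

For any data set with at least two distinct values, the (n-1) version comes out slightly larger than the n version, and the gap shrinks as n grows.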
ghostwalker
#3
7 years ago
(Original post by MathMeister)
Why are Sxx and Syy different from the form of standard deviation?
You should be aware that both S_{xx} and s_{xx} exist.

The latter form (small s) is the variance, and the former (large S) is n times that.

S_{xx}=ns_{xx}

Does that cover it? If not can you elaborate on your question, as I won't have understood what you're getting at.
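
One way to see why the two forms lead to the same correlation coefficient: a minimal Python sketch, assuming the common textbook convention that S_xx is the sum of squared deviations of x from its mean (with S_yy and S_xy defined analogously). That convention is an assumption here; it is consistent with S_{xx}=ns_{xx} but is not spelled out in the thread. The helper name and the data are invented for illustration.

```python
import math


def summary_sums(xs, ys):
    """Return n, S_xx, S_yy, S_xy as sums of (products of) deviations from the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    S_xx = sum((x - x_bar) ** 2 for x in xs)
    S_yy = sum((y - y_bar) ** 2 for y in ys)
    S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return n, S_xx, S_yy, S_xy


# Illustrative data only (hypothetical values).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

n, S_xx, S_yy, S_xy = summary_sums(xs, ys)

# PMCC written with the capital-S sums.
r_from_sums = S_xy / math.sqrt(S_xx * S_yy)

# PMCC written as covariance / (sd_x * sd_y), each obtained by dividing by n.
var_x = S_xx / n
var_y = S_yy / n
cov_xy = S_xy / n
r_from_definition = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

print(r_from_sums, r_from_definition)
```

The factor of n appears once in the numerator and once (as the square root of n squared) in the denominator, so it cancels. This is why the formula with S_xx and S_yy does not look like the SD formula yet gives the same value of r.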
MathMeister
#4
Thread starter, 7 years ago
(Original post by ghostwalker)
...
Thank you.
What I understand is that the SD is an estimator of the deviation from the mean of a set of data points, i.e. how spread out from the mean they are, which is less robust/easier to use than the MAD.
I know that the PMCC measures the magnitude and direction of correlation.
I see that to determine how close together the variables (let's say x and y) are, you would use the standard deviation to see how spread out they are (and therefore how correlated they are, i.e. how close the points are to the line). You'd do this for both x and y, as you need to know how spread out they are from both sides.
Please may you tell me whether this is true and, if so, explain why the equations for Sxx and Syy do not look similar to the SD equation.
And please explain what the covariance is.
ghostwalker
#5
7 years ago
(Original post by MathMeister)
Thank you.
What I understand is that the SD is an estimator of the deviation from the mean of a set of data points, i.e. how spread out from the mean they are, which is less robust/easier to use than the MAD.
I know that the PMCC measures the magnitude and direction of correlation.
I see that to determine how close together the variables (let's say x and y) are, you would use the standard deviation to see how spread out they are (and therefore how correlated they are, i.e. how close the points are to the line). You'd do this for both x and y, as you need to know how spread out they are from both sides.
The standard deviation of x (or y) is a measure of the spread of x (or y). The standard deviations tell you nothing about the regression line, and are really just scaling factors so that |PMCC| <= 1.

Please may you tell me whether this is true and, if so, explain why the equations for Sxx and Syy do not look similar to the SD equation.
I thought I covered this in my previous post.

And please explain what the covariance is.
I quote directly from Wikipedia (it explains it better than I can):

"In probability theory and statistics, covariance is a measure of how much two random variables change together. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the smaller values, i.e., the variables tend to show similar behavior, the covariance is positive. In the opposite case, when the greater values of one variable mainly correspond to the smaller values of the other, i.e., the variables tend to show opposite behavior, the covariance is negative. The sign of the covariance therefore shows the tendency in the linear relationship between the variables. The magnitude of the covariance is not easy to interpret. The normalized version of the covariance, the correlation coefficient, however, shows by its magnitude the strength of the linear relation."
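
The point about sign and magnitude in the quoted passage can be sketched in Python. This is only an illustration: the function names are hypothetical, the covariance here divides by n, and the height/weight figures are made up.

```python
import math


def covariance(xs, ys):
    """Covariance with divisor n: the mean product of deviations from the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Made-up data: heights in metres and weights in kilograms.
heights_m = [1.5, 1.6, 1.7, 1.8, 1.9]
weights_kg = [50.0, 58.0, 63.0, 72.0, 80.0]

# The same heights expressed in centimetres.
heights_cm = [100 * h for h in heights_m]

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
corr_m = correlation(heights_m, weights_kg)
corr_cm = correlation(heights_cm, weights_kg)

# Pairing large heights with small weights flips the sign of the covariance.
cov_reversed = covariance(heights_m, list(reversed(weights_kg)))

print(cov_m, cov_cm)      # covariance changes with the units
print(corr_m, corr_cm)    # correlation does not
print(cov_reversed)       # negative
```

Changing the units multiplies the covariance by 100 but leaves the correlation unchanged, which is one way to read "the magnitude of the covariance is not easy to interpret".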
davros
#6
7 years ago
(Original post by MathMeister)
Thank you.
What I understand is that the SD is an estimator of the deviation from the mean of a set of data points, i.e. how spread out from the mean they are, which is less robust/easier to use than the MAD.
It's not an "estimator" of the deviation; it is the average deviation, or at least one possible measure of it.

Both SD and MAD are possible measures of average deviation from the mean, and I imagine in principle one could come up with a more complicated measure of deviation. But it's a mistake to think that there is one "true" deviation and everything else is an estimator of it. SD and MAD are possible measures of spread, just as mean, median and mode are possible candidates for an "average" i.e. typical value of a set of data.

The SD is usually easier to manipulate from a calculus point of view, although non-mathematicians (e.g. social scientists) would probably argue that MAD is simpler to calculate. So "easier" is subjective. Also, I'm not sure what you mean by "less robust"; this isn't really a mathematical term.
MathMeister
#7
Thread starter, 7 years ago
(Original post by davros)
...
Does the covariance measure how close the points are to the line, i.e. how strong the correlation is?
davros
#8
7 years ago
(Original post by MathMeister)
Does the covariance measure how close the points are to the line, i.e. how strong the correlation is?
See ghostwalker's quote from Wikipedia: you need a normalized version to measure the strength of the linear relationship.
MathMeister
#9
Thread starter, 7 years ago
(Original post by davros)
See ghostwalker's quote from Wikipedia: you need a normalized version to measure the strength of the linear relationship.
That is what the PMCC does though...