# Statistics - measure of similarity

What statistic/function can I use to measure the similarity of two sets of data? I basically want something like a correlation coefficient, but one that takes into account not just how well one predicts the other but also how similar they are. E.g. if I have two sets of data, one defined by and one defined by , the correlation coefficient for these two datasets will be 1. I'd like a statistic that would return a 1 iff the two data sets are equivalent i.e. and .
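One statistic with roughly this property is Lin's concordance correlation coefficient, which penalises differences in mean and scale as well as scatter. Below is a minimal sketch of it next to the ordinary Pearson coefficient. The function names `pearson` and `concordance` and the sample lists `x` and `doubled` are made up for illustration and are not the datasets described above.

```python
from math import sqrt


def _mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def pearson(x, y):
    """Ordinary Pearson correlation: 1 for any exact increasing linear relation."""
    mx, my = _mean(x), _mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def concordance(x, y):
    """Lin's concordance correlation coefficient, using 1/n variances.

    Unlike Pearson, it only reaches 1 when the paired values are identical.
    """
    n = len(x)
    mx, my = _mean(x), _mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical illustration data: the second list is exactly twice the first.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
doubled = [2 * v for v in x]

print(pearson(x, doubled))      # perfectly correlated
print(concordance(x, doubled))  # well below 1, because the values differ
print(concordance(x, x))        # 1 only when the two lists match
```

For these made-up lists the Pearson coefficient is 1, while the concordance coefficient is 8/19 (about 0.42) and only reaches 1 when the list is compared with itself.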

Specifically, I have a model that predicts how likely an individual is to be accepted into a college based on their A-levels, and groups them into categories based on this likelihood. I also have a sample of individuals and know whether or not each of them was accepted. I've used this data to produce an actual accept rate for each likelihood category, so my data looks something like:

| Modelled likelihood | Actual accept rate for that likelihood |
| --- | --- |
| 0% | 2% |
| 10% | 12% |
| 20% | 18.5% |
| 30% | 34% |
| 40% | 41% |
| 50% | 49% |
| 60% | 60% |
| 70% | 71% |
| 80% | 74% |
| 90% | 87% |
| 100% | 93% |

How can I measure how 'similar' these are? I imagine something similar to a standard deviation could be used, but instead of looking at the 'distance' between each data point and the average of the data points, we'd look at the distance between each data point and its corresponding data point in the second array...
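The quantity described here is usually called the root-mean-square error (RMSE): the square root of the average squared gap between each modelled value and its actual counterpart. A minimal sketch using the figures from the table follows. The variable names are made up for illustration, and every category is weighted equally, which is an assumption, since the number of individuals in each category isn't given.

```python
from math import sqrt

# Values from the table above, in percentage points.
modelled = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
actual = [2, 12, 18.5, 34, 41, 49, 60, 71, 74, 87, 93]

# Gap between each actual rate and its corresponding modelled likelihood.
gaps = [a - m for m, a in zip(modelled, actual)]

# Root-mean-square error: like a standard deviation, but measured
# around the paired modelled value instead of around the mean.
rmse = sqrt(sum(g ** 2 for g in gaps) / len(gaps))

# Mean absolute error: the average size of the gap, less sensitive to outliers.
mae = sum(abs(g) for g in gaps) / len(gaps)

print(f"RMSE: {rmse:.2f} percentage points")  # RMSE: 3.35 percentage points
print(f"MAE:  {mae:.2f} percentage points")   # MAE:  2.59 percentage points
```

Both measures are 0 only when every actual rate equals its modelled likelihood, and they grow as the two columns drift apart. If the categories contain very different numbers of individuals, weighting each squared gap by its category size would probably be a fairer summary.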

Any help is much appreciated. Sorry if this is a stupid and/or actually complicated question.