
Real-world Statistics Problem: Estimating variance from a set of samples

1. Hi All. Have been thinking about the best way to approach a specific example of this problem for a little while and decided it would be fun to generalise and present as a puzzle. I'm a big fan of problems where the tools required are not specified, only the outcome, and this falls into that category.

We are studying players of some game, where success is measured in points scored, and want to know how reliably we can predict their ability from their previous results. We assume that players' ability does not change, i.e. that they have a static expectation that they will score E points per game. The score in an individual game may be negative, and for each player it may be assumed that the score can be modelled as a normally distributed random variable:
S ~ N(E, σ²)
We assume that σ is the same for all players, but E is not.

Each player records their monthly results, noting down games played G, total score T, and their average score per game for that month (which can be trivially calculated from the other two).

Is it possible to estimate σ? How would one go about doing so?

Using our estimate of σ, or otherwise, can we find some way of expressing our confidence in a player's ability? e.g. "Once a player has played 1000 games with an average score of X, we can be 95% sure that X is within 10 points/game of their expectation E"
2. I'm not too great at statistics, but if you're trying to estimate the standard deviation for a confidence interval, surely you would need at least one example for each player: games played, their total score, their probability of achieving that score and their individual mean E? Otherwise there would generally be too many unknowns to get an estimate of any kind.
3. We have multiple examples for each player in the following format (due to their monthly results recording):
3467 games, 59 points/game
2147 games, 67 points/game
6777 games, 74 points/game
1225 games, 61 points/game
but there's no way to know their "true" ability. In fact, the entire purpose of the exercise is to try to use the data we have to estimate how confident we can be of someone's "true" ability (E) after they've played N games.
4. (Original post by Followthe)
for each player it may be assumed that the score can be modelled as a normally distributed random variable: S ~ N(E, σ²)

We assume that σ is the same for all players, but E is not.
Are you assuming that each score is independent of any other? The problem becomes much harder if you don't (essentially you would have to use techniques from time series analysis to do it), but it would be much more realistic if you don't!

Each player records their monthly results, noting down games played G, total score T, and their average score per game for that month (which can be trivially calculated from the other two). Is it possible to estimate σ? How would one go about doing so?
For estimating sigma, this is a very poor design! Sigma represents the spread of individual scores around the mean, but in this design the scores are grouped into mean scores per month. So, if you have a series of monthly mean scores, you are essentially going to have to use the formula Var(X̄) = σ²/n for the variance of a mean of n scores (where n varies from month to month) to get your estimate of sigma. Doable, but tricky, and the estimate of sigma will be more imprecise the larger the values of G.
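One possible way to carry this out is sketched below in Python. It is not necessarily what the poster had in mind. It assumes scores are independent, each player's E is constant, and σ is shared by all players. Under those assumptions a monthly average over G games has variance σ²/G, so the games-weighted squared deviations of a player's monthly averages from that player's overall average, summed and divided by (months − 1), estimate σ². The function name `pooled_sigma` and the data layout are made up for illustration; the sample records are the four monthly results quoted earlier in the thread.

```python
import math

def pooled_sigma(players):
    """Estimate the common per-game sigma from grouped monthly records.

    players: list of players, each a list of (games, mean_score) tuples,
    one tuple per month. Assumes independent, normally distributed scores
    with the same sigma for everyone and a fixed E per player.
    """
    weighted_ss = 0.0   # sum over months of G * (monthly mean - player mean)^2
    deg_freedom = 0     # each player contributes (number of months - 1)
    for months in players:
        total_games = sum(g for g, _ in months)
        # Player's overall average = total points / total games.
        player_mean = sum(g * m for g, m in months) / total_games
        weighted_ss += sum(g * (m - player_mean) ** 2 for g, m in months)
        deg_freedom += len(months) - 1
    if deg_freedom < 1:
        raise ValueError("need at least one player with two or more months")
    return math.sqrt(weighted_ss / deg_freedom)

# The four monthly records posted in the thread, treated as one player.
example_player = [(3467, 59), (2147, 67), (6777, 74), (1225, 61)]
sigma_hat = pooled_sigma([example_player])
print(f"estimated sigma: {sigma_hat:.1f} points/game")
```

With only four months there are just three degrees of freedom, so the estimate from a single player is very rough. Pooling many players adds (months − 1) degrees of freedom per player, which is where the shared-σ assumption pays off.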

Using our estimate of σ, or otherwise, can we find some way of expressing our confidence in a player's ability? e.g. "Once a player has played 1000 games with an average score of X, we can be 95% sure that X is within 10 points/game of their expectation E"
This would be the easy bit, either using confidence intervals in classical inference or credible intervals in Bayesian analysis.
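As a rough sketch of the classical route, assuming σ is treated as known and scores are independent and normal: the average over n games has standard error σ/√n, so a 95% interval has half-width of about 1.96·σ/√n. The helper names `half_width` and `games_needed` are hypothetical. If σ is itself an estimate from few degrees of freedom, a t-based interval would be wider than what this gives.

```python
import math
from statistics import NormalDist

def half_width(sigma, games, confidence=0.95):
    """Half-width of a two-sided normal interval for E after `games` games."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * sigma / math.sqrt(games)

def games_needed(sigma, margin, confidence=0.95):
    """Smallest number of games for which the half-width is <= margin."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / margin) ** 2)

# Illustrative numbers only: sigma = 100 points/game is an assumed value.
print(half_width(sigma=100, games=1000))
print(games_needed(sigma=100, margin=10))
```

Reading the result as "after n games we can be 95% sure the average is within the half-width of E" matches the kind of statement asked for in the original post.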

Updated: August 10, 2016