1. As you all may know, the exam is in 4 days. :x

I was doing this past paper: GCE June 2011 6PH08 (6b)

In part (b) of the very first question there is a calculation for the volume. We are supposed to average the results already in the table. When I did it (for practice) I treated the values 28.2 and 5.94 as anomalies, as they seemed a bit distant from the other values. It seemed logical.
However, the mark scheme does not do this: all five values are used. The examiner's report mentions students who cut those two values out as anomalies and averaged just the remaining 4, but still used them in the uncertainty calculation in part (c). It doesn't say whether this is right or wrong, although the examiner's comment is "student has done well."

I have two questions:

> What are the criteria for considering a value an anomaly and excluding it from the average, in the Edexcel Unit 6 paper?

> When you consider something an anomaly, is it still included in the half-range/average uncertainty calculation? Is that the correct thing to do?

Much thanks. And good luck to everyone doing the exam!
Well, I think you've jumped the gun in calling an anomaly on the width, though tbh I don't know what the Edexcel syllabus has to say about it. It's a bit of a running joke at universities how quickly first-year students start crossing out data they shouldn't and calling them anomalies.

There are a lot of methods for picking out probable anomalies / outliers, and I'll show you a decent but simple one called Tukey's test that doesn't require a lot of statistics knowledge. It works by comparing data values to the amount of 'spread' in the complete set of data. This is FYI: I don't know whether there are any marks in it at A level, but it might give you a feel for how far away from the centre you can get before you need to start worrying about anomalies and thinking about whether to reject them.

Tukey's relies on calculating the interquartile range of your data... so first you sort the data into ascending order like this...

WIDTH (mm)
28.2 (min)
28.9 (Q1)
29.0 (Q2 || median)
29.1 (Q3)
29.3 (max)

IQR=(29.1-28.9)
=0.2

Tukey's says we should be suspicious of values that are less than Q1 - 1.5 IQR or more than Q3 + 1.5 IQR, and very suspicious of values that are less than Q1 - 3 IQR or more than Q3 + 3 IQR, which are 'far out' values distant from the centre.

so we are suspicious of data less than 28.6 or greater than 29.4 and very suspicious of data less than 28.3 or greater than 29.7

therefore 28.2 is highly suspect, but at the other extreme the value 29.3 is perfectly plausible.
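
To get a feel for how much that one value matters, here is a rough Python sketch (purely an illustration, not something taken from the mark scheme). It assumes the uncertainty is taken as the half-range, (max - min) / 2, as in the question; the helper name mean_and_half_range is made up for this example.

```python
from statistics import mean

def mean_and_half_range(values):
    """Return (mean, half-range), where half-range = (max - min) / 2."""
    return mean(values), (max(values) - min(values)) / 2

# Width readings in mm, as listed in the sorted table above.
width_mm = [28.2, 28.9, 29.0, 29.1, 29.3]

# Using all five readings.
all_mean, all_unc = mean_and_half_range(width_mm)

# Dropping the suspect reading (28.2) before averaging.
kept = [w for w in width_mm if w != 28.2]
kept_mean, kept_unc = mean_and_half_range(kept)

print(f"all five:     {all_mean:.2f} +/- {all_unc:.2f} mm")
print(f"without 28.2: {kept_mean:.2f} +/- {kept_unc:.2f} mm")
```

With all five readings the mean is 28.9 and the half-range is 0.55; without 28.2 the mean rises to about 29.1 and the half-range drops to 0.2. Which version earns the marks is a separate question that this sketch can't answer.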

-----
THICKNESS (mm)
5.94 (min)
5.97 (Q1)
5.99 (Q2 || median)
6.01 (Q3)
6.04 (max)

IQR = 6.01-5.97
=0.04

Q1-1.5 IQR = 5.91
Q3 + 1.5 IQR = 6.07

so we don't have to worry about any of the THICKNESS data at all: every value, including 5.94, lies between 5.91 and 6.07, so there don't HAVE to be any anomalies.
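
If you want to check numbers like these quickly, here is a rough Python sketch of the same Tukey check applied to both tables above. It is only an illustration: the names tukey_fences and flag are made up for this example, and it assumes the quartiles are picked the way I did above, which for five values matches the "inclusive" method of Python's statistics.quantiles.

```python
from statistics import quantiles

def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # method="inclusive" gives Q1 and Q3 as the 2nd and 4th of five sorted values.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag(values):
    """Label each value 'ok', 'suspect' (beyond 1.5 IQR) or 'far out' (beyond 3 IQR)."""
    inner_lo, inner_hi = tukey_fences(values, 1.5)
    outer_lo, outer_hi = tukey_fences(values, 3.0)
    labels = {}
    for v in values:
        if v < outer_lo or v > outer_hi:
            labels[v] = "far out"
        elif v < inner_lo or v > inner_hi:
            labels[v] = "suspect"
        else:
            labels[v] = "ok"
    return labels

width_mm = [28.2, 28.9, 29.0, 29.1, 29.3]
thickness_mm = [5.94, 5.97, 5.99, 6.01, 6.04]

print("width:", flag(width_mm))
print("thickness:", flag(thickness_mm))
```

Other quartile conventions can give slightly different fences for such a small data set, so treat the labels as a rough guide rather than a verdict.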
3. right.. I remember studying something like this for S1.
Thanks!

