The Student Room Group

How reliable are grades?

I saw a recent blog which seemed to be saying something about grades being unreliable. Is it true that Ofqual have stated that grades are reliable only to one grade either way? And if it is true, can a place be claimed if the grade as awarded is one grade under an offer? Does anyone know?


Original post by grade guardian
I saw a recent blog which seemed to be saying something about grades being unreliable. Is it true that Ofqual have stated that grades are reliable only to one grade either way? And if it is true, can a place be claimed if the grade as awarded is one grade under an offer? Does anyone know?

Given that the difference between one grade and the next could be a single mark on a single paper, it would be accurate to say that someone with a grade A could have been one mark away from a B, or one mark away from an A* - and there's no way to tell which. I suspect that this is what Ofqual were alluding to.

As to the second part of your post, that's a simple "No". You can't "claim" a place unless you meet the actual offer you've been made. However, you might find that the university opts to confirm your place if you've just missed the offer - if they have space available.
Thank you, that's very helpful. But I'm still rather puzzled. That blog links to this one, which shows this chart from an Ofqual report, Marking Consistency Metrics - An update, published in November 2018. The chart is very complicated. Could you explain what it means, please?
Screenshot 2023-04-19 at 20.35.55.png
Original post by grade guardian
Thank you, that's very helpful. But I'm still rather puzzled. That blog links to this one, which shows this chart from an Ofqual report, Marking Consistency Metrics - An update, published in November 2018. The chart is very complicated. Could you explain what it means, please?
Screenshot 2023-04-19 at 20.35.55.png

I haven't read the article from which this figure has been extracted, so I'm not going to attempt to explain the boxes, whiskers and outliers on the box plot. Instead I'm going to focus on the key piece of data which is the heavy vertical line within each blue box.

Suppose two examiners mark a given paper, and one examiner awards it 65 marks but another awards it 67 marks. That's perfectly possible, given that there's often some subjectivity in the mark scheme. Which mark is the "correct" mark, and therefore the "correct" grade?
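To make that concrete, here is a minimal sketch in Python of how two equally defensible marks can land on different grades. The grade boundaries below are entirely hypothetical (real boundaries vary by board, subject and year); the only point is that 65 and 67 can sit either side of one.

```python
from bisect import bisect_right

# Hypothetical minimum marks for each grade, lowest first.
# These are illustrative assumptions, not real boundaries.
BOUNDARIES = [(0, "U"), (40, "E"), (48, "D"), (56, "C"), (66, "B"), (74, "A")]


def grade_for(mark):
    """Return the grade whose minimum mark is the highest one not above `mark`."""
    if mark < 0:
        raise ValueError("mark must be non-negative")
    thresholds = [minimum for minimum, _ in BOUNDARIES]
    return BOUNDARIES[bisect_right(thresholds, mark) - 1][1]


print(grade_for(65))  # C
print(grade_for(67))  # B
```

With these made-up boundaries, the same script is a C from one examiner and a B from the other, even though neither mark is obviously "wrong".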

Ofqual define what they call the "definitive" grade as being that which would be awarded by a senior examiner. They've then compared this grade with that awarded by an "ordinary" examiner to establish how likely the ordinary examiner is to award the definitive grade.

For subjects like Maths, where there's not much subjectivity (an answer is typically either right or wrong), it's very likely that the ordinary examiner will award the definitive grade. About 95% likely in fact, looking at the figure.

For subjects like English Language, where there's less in the way of right/wrong answers, it's just over 60% likely that an ordinary examiner will award the definitive grade.

Both of these numbers can be read from the thick vertical line within the dark blue boxes. The light blue boxes answer the same question but at a component level. A subject grade is more likely to be accurate than a component grade (as two components, one over-marked and one under-marked, will tend to even out across a subject).
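The "evening out" effect can be illustrated with a toy simulation. This is only a sketch under stated assumptions: true marks are uniform on 0-100, each component's marking error is normally distributed, and grade bands are equal-width. None of these figures come from the Ofqual report, and `agreement_rate` is a hypothetical helper name.

```python
import random


def agreement_rate(n_components, width=10, sd=3.0, trials=20000, seed=0):
    """Share of simulated scripts whose awarded grade band matches the true band.

    Assumptions (all hypothetical): each component is out of 100, grade bands
    are `width` marks wide per component, and marking error is Gaussian with
    standard deviation `sd` marks per component.
    """
    rng = random.Random(seed)
    band = width * n_components  # band width scales with the total marks
    agree = 0
    for _ in range(trials):
        true_total = 0.0
        marked_total = 0.0
        for _ in range(n_components):
            true_mark = rng.uniform(0, 100)
            true_total += true_mark
            marked_total += true_mark + rng.gauss(0, sd)
        if true_total // band == marked_total // band:
            agree += 1
    return agree / trials


component = agreement_rate(1)  # a single component graded on its own
subject = agreement_rate(3)    # three components added into a subject total
print(component, subject)
```

Under these assumptions the subject-level agreement comes out higher than the component-level agreement, because independent over- and under-marking partly cancels when components are added together.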

If you want an explanation of the inter-quartile range expressed by the boxes themselves, the whiskers (the lines extending from them) and the outliers (the dots) then I'll need to actually read the article. Essentially, they're an indication of the range and statistical confidence within the data.

Does that answer your question?
Thank you so much! That is very clear. But it does raise a question. Taking the example of English that you highlight, if only 60% of scripts are given the 'definitive' grade, that seems to imply that 40% of scripts are given a grade which is 'non-definitive'. Does that mean that about 40% of English grades are - to use a simpler word - wrong? History looks to be even worse, and most of the other subjects aren't too good either...
Original post by grade guardian
Thank you so much! That is very clear. But it does raise a question. Taking the example of English that you highlight, if only 60% of scripts are given the 'definitive' grade, that seems to imply that 40% of scripts are given a grade which is 'non-definitive'. Does that mean that about 40% of English grades are - to use a simpler word - wrong? History looks to be even worse, and most of the other subjects aren't too good either...

That would be my interpretation, yes. Note that the grade may be "wrong" in either direction, so (unless there is a skew in the data) just as many students will benefit from this as will suffer as a result of it. Perhaps not much comfort, though.

It would be easy to argue that universities would benefit from seeing the actual marks* a student obtained (not just the grade) so they could tell the difference between a student who just missed their offer grade and one who was well below it. Alas, this is not how the system works.

* The marks would need to be standardised to allow for differences between exam boards, and between different cohorts. This concept is normally called Uniform Mark Scale (UMS).
Wow! My thanks again, very clear.

If I've understood correctly, for English, about 60% of students are awarded grades that are "definitive"/"right", about 20% are awarded "non-definitive"/"wrong" grades higher than the student truly merits, and about 20% are awarded "non-definitive"/"wrong" grades lower than the student truly merits.

So for a cohort of about 700,000 GCSE students for the exams about to happen, that means about 420,000 students will get the grade they actually deserve, about 140,000 will get a grade too high, and about 140,000 will get a grade too low.
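That arithmetic can be checked with a few lines of Python. This is a sketch only: the 700,000 cohort and the 60% figure are the rough numbers used in this thread, the even split between "too high" and "too low" is the no-skew assumption mentioned above, and `split_cohort` is a hypothetical helper name.

```python
def split_cohort(cohort, definitive_share):
    """Split a cohort into (definitive, too_high, too_low) grade counts.

    Assumes non-definitive grades are split evenly between too high
    and too low, i.e. there is no skew in the data.
    """
    definitive = round(cohort * definitive_share)
    non_definitive = cohort - definitive
    too_high = non_definitive // 2
    too_low = non_definitive - too_high
    return definitive, too_high, too_low


print(split_cohort(700_000, 0.60))  # (420000, 140000, 140000)
```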

140,000 students being given a grade lower than they deserve seems to me to be a big deal - and that's just English...

As you say, that's not much comfort... Oh dear...
This is such an interesting thread. I am preparing for my A-Levels, and how can I take any comfort, if I end up one or two grades below what my universities require, knowing that had someone else marked my papers I would’ve been awarded a different grade?

Exams are a fundamental public service and considering that Ofqual are supposed to provide a ‘reliable indication’ of a student’s knowledge and skills, I’m wondering if they’re not fulfilling their purpose? Thousands of students are being failed each exam series and many people seem oblivious to this fact.

However, I researched this further and while many are oblivious to this, that doesn’t include Ofqual. Dame Glenys Stacey, who led Ofqual briefly, recognised that grades are reliable “to one grade either way”, and someone else remarked that this applies to 96% of GCSEs and 98% of A-Levels, implying that 4% of GCSE grades and 2% of A-Level grades are out by more than one grade either way? That’s extremely unreliable when exam results are, sadly but truthfully, relied upon so heavily by universities and employers.

I am sitting History A-Level and evidently it would be better to roll a dice to determine my grade. When I was doing my GCSEs under the Teacher-Assessed Grades system, I first heard about grade unreliability and remember wondering how on earth teachers would be able to determine what a Grade 5 or 6 looks like when there really is no set description. Surely Ofqual can do better but more students, parents and teachers need to be made aware of this failure.
Original post by 2023SummerGrades
This is such an interesting thread. I am preparing for my A-Levels, and how can I take any comfort, if I end up one or two grades below what my universities require, knowing that had someone else marked my papers I would’ve been awarded a different grade?

This is one reason why all exam boards allow for a "Review of marking" (or, if a university place is pending, a "Priority review of marking") to be requested.

From AQA's description of the service, here:

"If you request a review or priority review of marking:
- it includes a clerical re-check
- you’ll receive a copy of the reviewed script as part of this service
- a second examiner will review the paper/recording again to identify genuine marking errors or unreasonable marking
- we’ll make sure all the marks are counted
- a grade can go down as well as up."

The last bullet point means that it's only sensible to request a review if you're close to the grade boundary above.

I haven't seen the stats, but a paper which has been through that process should be significantly less likely to have the "wrong" grade assigned.
Original post by grade guardian
I saw a recent blog which seemed to be saying something about grades being unreliable. Is it true that Ofqual have stated that grades are reliable only to one grade either way? And if it is true, can a place be claimed if the grade as awarded is one grade under an offer? Does anyone know?


Hello, @grade guardian
I hope all is well.

Personally, I feel that grades are only ONE PART of your success. When speaking to my advisor at university during my postgraduate degree, I was told that your grades are only one part of getting a job. The rest comes down to the following:

- Your personality during the interview
- your adaptability
- your work experience
- any volunteer work
- how consistent and reliable you have been
- good timing / patience etc
- How much people feel they would work well with you as a team member

There is so much that people consider as well as your grades.

For instance someone could have a distinction (top marks) but not have the adaptability or personality that quite fits within their team. However the next person could have a merit (middle marks) but their desire to learn and progress including their ability to work well as a team member, may be enough to get the job, over the person who has the best marks.

I understand that this is a side note to your question, but I hope it inspires hope around grades. :smile:

all the best!

- Laura
Original post by UniofChester Rep
Hello, @grade guardian
I hope all is well.

Personally, I feel that grades are only ONE PART of your success. When speaking to my advisor at university during my postgraduate degree, I was told that your grades are only one part of getting a job. The rest comes down to the following:

- Your personality during the interview
- your adaptability
- your work experience
- any volunteer work
- how consistent and reliable you have been
- good timing / patience etc
- How much people feel they would work well with you as a team member

There is so much that people consider as well as your grades.

For instance someone could have a distinction (top marks) but not have the adaptability or personality that quite fits within their team. However the next person could have a merit (middle marks) but their desire to learn and progress including their ability to work well as a team member, may be enough to get the job, over the person who has the best marks.

I understand that this is a side note to your question, but I hope it inspires hope around grades. :smile:

all the best!

- Laura

Given that @grade guardian asked, "can a place be claimed if the grade as awarded is one grade under an offer?" I suspect they're referring to a conditional offer they have received for entry to an undergraduate degree at university. As such, characteristics like "how consistent and reliable you have been" and "good timing / patience etc" make no difference - it's all about the grades.
Thank you, but I'm still troubled.

As you describe, a "review of marking" is a clerical check to make sure there are no adding mistakes, and that each question's mark is compliant with the mark scheme, and not "unreasonable". But it's not a re-mark by a senior examiner.

That seems to be very important. In your earlier reply, you stated that the chart shows the difference between the grade resulting from an ordinary examiner's mark, and the "definitive" grade resulting from a senior examiner's mark. That makes sense only if the ordinary examiner's marks are fully "reasonable" and compliant with the mark scheme. This therefore shows that it is possible for the ordinary examiner's mark to be compliant, and totally "reasonable", but still result in a "non-definitive" grade. And for English, that seems to happen for 40% of scripts.

But because the marks underpinning the "non-definitive" grade are all compliant, any "review of marking" will confirm the original "non-definitive" grade. How can that be fair?
Hi Laura - thank you - it's good to know that U Chester takes such a sensible view. My fear, however, is that other unis might not be so wise...
Original post by grade guardian
Thank you, but I'm still troubled.

As you describe, a "review of marking" is a clerical check to make sure there are no adding mistakes, and that each question's mark is compliant with the mark scheme, and not "unreasonable". But it's not a re-mark by a senior examiner.

That seems to be very important. In your earlier reply, you stated that the chart shows the difference between the grade resulting from an ordinary examiner's mark, and the "definitive" grade resulting from a senior examiner's mark. That makes sense only if the ordinary examiner's marks are fully "reasonable" and compliant with the mark scheme. This therefore shows that it is possible for the ordinary examiner's mark to be compliant, and totally "reasonable", but still result in a "non-definitive" grade. And for English, that seems to happen for 40% of scripts.

But because the marks underpinning the "non-definitive" grade are all compliant, any "review of marking" will confirm the original "non-definitive" grade. How can that be fair?

The text I quoted above was from the AQA web site. It seems that the different exam boards take different approaches.

What Pearson/Edexcel say, here, is that "A senior examiner will review the original marking and change if it error in the application of the mark scheme were found."

OCR say here that, "OCR monitors reviewers throughout the review of marking and moderation period to ensure that they are making appropriate and consistent decisions and only changing marks to correct marking error. For reviews of marking, OCR monitors all reviewers daily throughout the review period by checking a sample of each reviewer's work. If we find a problem, the reviewer will be given feedback, re-trained or stopped from reviewing as necessary, and we will carry out the review in question again." So the reviews aren't being carried out by senior examiners, but the reviews themselves are clearly being monitored.

The system is clearly not perfect, and some candidates will end up missing out on their university place as a result. But it's the system we have.
Thanks once more - I really appreciate your diligence and thoroughness.

But as you describe things more fully, my concern only increases. And I must confess that I don't see that boards take different approaches - Pearson says that a senior examiner does the review, but they confirm that the original mark will be changed only if there is an "error", as defined by a failure to comply with the mark scheme; OCR also say "only changing marks to correct marking error".

None of this is what I think it should be - a professional second opinion, a re-mark by a senior examiner to ensure that the student is awarded the "definitive" grade. This confirms - as I feared - that it is possible that a script can be marked by an ordinary examiner such that there are no "marking errors", yet result in a "non-definitive" grade. With the sting-in-the-tail that this cannot be corrected by the "appeals" process.

If that is the case - as it appears to be - then about 280,000 GCSE English students will get the wrong grade this summer, with no right of appeal. OK, 140,000 are lucky, getting a grade higher. But 140,000 will get a grade too low, and be stuck. If that happens at the 4/3 boundary, that's a killer - how many students will lose a year and be forced to re-sit just because of this? And what about all the other subjects, and - as you say - the impact on uni entrance?

You are right that "the system is clearly not perfect". But surely, surely, it can be a lot better.
Your reference to AQA, Pearson/Edexcel and OCR got me to look around a bit, and I found this on the SQA website https://www.sqa.org.uk/sqa/79049.html

Screenshot 2023-04-21 at 08.20.59.png

I appreciate that the SQA applies only to Scotland, but their description of a "marking review" is very similar to the "reviews of marking" carried out by AQA, P/E and OCR, with "marking in line with national standards" being the Scottish equivalent of "no marking errors".

But there is one, big, exception. SQA explicitly say "This is not a re-mark".

AQA, P/E and OCR don't say that. I find that very misleading - it's almost as if they are hoping people won't notice that what most people call an "appeal" is not what you might expect: an opportunity for an expert second opinion as achieved by a re-mark by a senior examiner, so as to ensure the "definitive" grade.

As you say, this is 'the system'.

But 'the system' seems to me to be very rotten.

How can I protest? What does it take to get 'the system' fixed?

Surely I can't be the only person who thinks this is all hugely unfair?

Original post by grade guardian
Your reference to AQA, Pearson/Edexcel and OCR got me to look around a bit, and I found this on the SQA website https://www.sqa.org.uk/sqa/79049.html

Screenshot 2023-04-21 at 08.20.59.png

I appreciate that the SQA applies only to Scotland, but their description of a "marking review" is very similar to the "reviews of marking" carried out by AQA, P/E and OCR, with "marking in line with national standards" being the Scottish equivalent of "no marking errors".

But there is one, big, exception. SQA explicitly say "This is not a re-mark".

AQA, P/E and OCR don't say that. I find that very misleading - it's almost as if they are hoping people won't notice that what most people call an "appeal" is not what you might expect: an opportunity for an expert second opinion as achieved by a re-mark by a senior examiner, so as to ensure the "definitive" grade.

As you say, this is 'the system'.

But 'the system' seems to me to be very rotten.

How can I protest? What does it take to get 'the system' fixed?

Surely I can't be the only person who thinks this is all hugely unfair?

I'll focus on your central question: "What does it take to get 'the system' fixed?"

Ofqual have said that the "definitive grade" is based on the mark awarded by a senior examiner. If we want more definitive grades, then we need more senior examiners. I don't know what it takes to turn an "ordinary" examiner into a "senior" examiner, but I suspect it's primarily time (i.e. experience) and money (in salary and training). We should probably also throw ability into the mix too, as I assume someone who's been an ordinary examiner for a set period of time doesn't necessarily become a senior examiner - they need to have an appropriate level of ability. So, those are the things you need to fix. Once every examiner is a senior examiner then every grade will be definitive - with no need for the "review of marking" process at all. :wink:

With regards to your "How can I protest?" question, you could: write to your MP, create a petition (e.g. at https://petition.parliament.uk/), contact an interested news outlet, try to create a Twitter storm, stand outside Ofqual's offices with a placard and megaphone - the usual ways people protest about anything really. In fact, you could do all of the above. Do I think it will make any difference? No, I do not. :frown:

You are not the only person who feels that this is "hugely unfair". However, most people just accept it as a flawed system and move on with their lives.
Oh dear!

How can a student rejected from a uni course for want of a single grade, or forced to re-sit GCSE English with a grade 3, just "move on with their lives" when that single grade, or that grade 3, were in fact "non-definitive", without right of appeal?

And thanks for the idea of starting a petition. That's really good.

I wonder what support it would get from others on TSR? Maybe this thread should be read more widely...
Original post by grade guardian
Oh dear!

How can a student rejected from a uni course for want of a single grade, or forced to re-sit GCSE English with a grade 3, just "move on with their lives" when that single grade, or that grade 3, were in fact "non-definitive", without right of appeal?

And thanks for the idea of starting a petition. That's really good.

I wonder what support it would get from others on TSR? Maybe this thread should be read more widely...

Note that there is a right of appeal, if you're not happy with the outcome of a "review of marking". See this PDF from the Joint Council for Qualifications (which has all the usual exam boards as members).

However, I suspect you're not going to like the way these are handled either. An appeal can be made if "a marking or moderation (or a review of marking/moderation) error has occurred", and - in the preliminary stage - "involves a consideration of the case by an awarding body officer who has not had any previous involvement with or personal interest in the matter. This preliminary stage will include consideration of the written submission from the appellant." (Note the lack of the magic words "senior examiner" in this text.) Following the preliminary stage there is potentially an appeal hearing. It's all detailed in that PDF.

Having looked at AQA's appeals process (see here), they say "For appeals made on the grounds of an unreasonable application of the mark scheme, we’ll commission a review and report from a senior examiner who will be provided with your grounds of appeal."

You might want to check the other exam boards which don't use a senior examiner during the "review of marking" to see if they do so during an appeal.
Taken across a full set of results, grades are generally fairly representative. There is always going to be some variation in marking, but the more papers you sit, the more it should average out.

Most marking is now done on screen: papers are scanned electronically rather than posted to examiners, which allows markers to work question by question rather than script by script, and that should greatly reduce variation across a paper. In addition, one exam is only one of the 3 or 4 assessments in an A-level, so the potential for marking variation is naturally minimised. On top of this there is stringent moderation and sampling to ensure accuracy, and nowadays the boards can easily do centre-to-centre statistical monitoring and re-check examiners who might have been slightly strict or generous well in advance of marks being released.

Grades are basically grouped averages, so there are always going to be students towards the ends of grade distributions. The alternative is a GPA-style system (which I'm in favour of, although this isn't something many people really advocate for, as far as I am aware).

In general, I think that when you look at the number of assessments in A-levels or GCSEs, a full set of GCSE or A-level results is pretty much representative of what was achieved. The bigger problem is normalising educational standards in schools (where there is huge variation in culture, resources and teaching quality) rather than marking.
