The challenge of keeping TSR running on the busiest day of the year
The Olympic Stadium is filled to the rafters. There's not a seat to spare at Wembley. Twickenham is completely packed and Old Trafford is fit to burst. Anfield, St James' Park and Stamford Bridge are all heaving. But still, there are thousands more people trying to get in.
Imagine all those arenas crammed to capacity and you start to get an idea of the numbers that flocked to The Student Room on A-level results day last week.
It's undoubtedly the busiest day of the year on TSR. This year, we saw a surge of more than half a million visits, as people joined us for help and support from our experts and their peers.
Dealing with such volume brings all kinds of challenges. Before results day, our dedicated members, moderators and in-house staff ensure our help features and forum threads are bang up to date. On the day, a panel of expert advisers staff the forums to provide minute-by-minute support for students.
But it's behind the scenes where the most crucial work is done, as TSR's crack team of web developers and techies ensure the site doesn't wilt under the pressure.
For them, results day started back in February. It was then that the focus turned towards improving the speed, efficiency and resilience of the site, to ensure it could cope with an influx of hundreds of thousands of people in a single day.
Come A-level results day and the dev team had ensured TSR could sail through anything short of an apocalypse. Its servers were protected from power cuts, internet outages and the failure of individual components. Stress tests ensured it could cope with more than double the expected traffic, while a carbon copy of the site - placed on the other side of the world - would keep TSR running even if its original servers were wiped from the face of the earth.
The hard work paid off in spades. Pete Kelly, TSR's technical director and the man who led the project, says that all the work ensured results day "couldn't have gone more smoothly".
Behind the scenes
A key factor in improving the resilience of TSR has been moving the site from a physical environment to a virtual one. Four powerful physical boxes run a network of 20 virtual servers, enabling a flexible structure that protects against component failure. If one element of hardware fails, the remainder can take on its share of the work without the loss of any functionality.
Ahead of results day itself, that hardware was boosted with the addition of another two physical boxes, giving additional fail-safe options as well as offering further capacity were it to be required.
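As a rough illustration of why the extra boxes help, the sketch below spreads the 20 virtual servers evenly across the physical boxes and works out the heaviest load any surviving box would carry if one failed. The even placement and the function name are assumptions made for illustration; the article does not describe how TSR actually schedules its virtual servers.

```python
import math

def max_vms_per_host(total_vms: int, hosts: int, failed: int = 0) -> int:
    """Largest number of VMs any surviving host carries under even placement."""
    surviving = hosts - failed
    if surviving <= 0:
        raise ValueError("no surviving hosts")
    return math.ceil(total_vms / surviving)

# Compare the original four boxes with the six used for results day.
for hosts in (4, 6):
    normal = max_vms_per_host(20, hosts)
    degraded = max_vms_per_host(20, hosts, failed=1)
    print(f"{hosts} boxes: up to {normal} VMs each, {degraded} with one box down")
```

Under these assumptions, losing one of four boxes pushes the busiest survivor from five virtual servers to seven, while losing one of six leaves it at four.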
Home for those servers is Telehouse in London - a state-of-the-art facility located on the internet backbone that offers a range of protective measures, such as multiple power sources and multiple internet connections. If a pylon topples over or a digger cuts through a fibre optic cable, TSR’s servers won’t be affected.
The set-up was mirrored in Vancouver, where an identical array of hardware and virtual servers replicated the TSR databases in real-time.
Should aliens invade London and make off with the TSR servers, the Vancouver mirror would seamlessly take over the running of the site. Anyone browsing the site would not even notice the difference (aside from the odd hysterical message-board post from London, perhaps).
This was amply demonstrated a few days before results day. On the Monday and Tuesday nights of last week, the site switched over to the Vancouver set-up for a few minutes, just to check it all worked OK. Everything ran without a hitch.
Away from the hardware, the site itself has had plenty of attention. External consultants advised on its architecture, and the core code was made more efficient so that TSR pages load even more quickly. TSR has become leaner and quicker. In the run-up to results day, it was relentlessly stress-tested. We simulated the arrival of 6,000 concurrent users in five-minute bursts, the equivalent of 72,000 users coming to the site in one hour. That's around double what we would actually expect, and during the test the site didn’t wobble.
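The stress-test arithmetic can be checked in a few lines. This is a minimal sketch using only the figures quoted above; the function name is hypothetical, and treating "around double" as exactly double is an assumption.

```python
# Figures quoted in the article.
BURST_USERS = 6_000   # simulated concurrent users per burst
BURST_MINUTES = 5     # length of each burst

def users_per_hour(burst_users: int, burst_minutes: int) -> int:
    """Scale one burst up to an hourly arrival figure."""
    bursts_per_hour = 60 // burst_minutes
    return burst_users * bursts_per_hour

simulated = users_per_hour(BURST_USERS, BURST_MINUTES)
# "Around double" the expected load implies roughly half this figure
# (assumed to be exactly half here).
implied_expected = simulated // 2

print(f"Simulated load: {simulated:,} users/hour")
print(f"Implied expected load: about {implied_expected:,} users/hour")
```

Twelve five-minute bursts of 6,000 users give the 72,000 an hour quoted, which puts the expected results-day load at roughly 36,000 users an hour.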
As a result, TSR ran smoothly and without incident throughout a remarkably busy results day. It was certainly put to the test. There was an average of 7,000 page requests every minute on TSR, peaking at just under 10,000 at 11:30.
“That means our servers processed and served out over nine million pages over the day,” says Pete. “Even with this amount of traffic, our average response times have remained under 150 milliseconds.
“Thanks to the tireless work of TSR’s technical team, our site could handle the strain of hundreds of thousands of visits, enabling us to once again provide crucial help and support to the UK’s A-level students – just when they need it most.”
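For anyone who wants to sanity-check the results-day figures, here is a minimal sketch. It assumes the per-minute average applies across a full 24 hours and rounds the peak up to 10,000; neither detail is stated in the article.

```python
AVG_REQUESTS_PER_MIN = 7_000
PEAK_REQUESTS_PER_MIN = 10_000  # the article says "just under" this
MINUTES_PER_DAY = 24 * 60       # assumes "the day" means a full 24 hours

daily_pages = AVG_REQUESTS_PER_MIN * MINUTES_PER_DAY
avg_per_second = AVG_REQUESTS_PER_MIN / 60
peak_per_second = PEAK_REQUESTS_PER_MIN / 60

print(f"Pages served over the day: about {daily_pages:,}")
print(f"Average rate: about {avg_per_second:.0f} pages/second")
print(f"Peak rate: just under {peak_per_second:.0f} pages/second")
```

On those assumptions the total comes to a little over ten million pages, which is consistent with Pete's "over nine million"; the difference presumably comes down to rounding in the quoted average.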