E-Mail : info@ardnet.co.uk

Testing our disaster recovery strategy

07/02/2016

Websites run on servers, and servers are much like your desktop computer: they might work fine for months, perhaps years, but electronic parts inevitably fail from time to time.  On your desktop computer it’s sensible to keep regular backups, but how do we back up your website on our servers?

All our web servers are kept in a state-of-the-art, UK-based data centre at a constant temperature, behind four levels of security access, with multiple UPS units, three diesel generators, fire mitigation, lightning protection and water detection systems, and staff on site 24/7/365.

All of our web servers use what’s called RAID mirroring.  This means there are two physical hard disks sitting side by side in the server that are continually kept in sync, so if the live drive suddenly crashes, the other drive should, in theory, take over immediately with no downtime.
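For readers who like to see the idea in code, the two-disk arrangement described above can be sketched as a toy model. This is only an illustration under stated assumptions: the class and method names are hypothetical, the "disks" are in-memory dictionaries, and it is not how our servers or any real RAID controller are implemented.

```python
class MirroredPair:
    """Toy model of two disks kept in sync (hypothetical, for illustration)."""

    def __init__(self):
        self.disks = [{}, {}]          # two in-memory stand-ins for physical disks
        self.healthy = [True, True]    # health flag for each disk

    def write(self, block, data):
        # Every write goes to each disk that is still healthy.
        for i, disk in enumerate(self.disks):
            if self.healthy[i]:
                disk[block] = data

    def fail(self, index):
        # Simulate a hardware failure of one disk.
        self.healthy[index] = False

    def read(self, block):
        # Serve the read from the first healthy disk.
        for i, disk in enumerate(self.disks):
            if self.healthy[i]:
                return disk[block]
        # With both disks gone, only a backup can help.
        raise OSError("both disks have failed; restore from backup")
```

If one disk fails, `read` still returns the data from the other; if both fail, the model raises an error, which is the situation backups exist for.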

But what happens if both drives fail at the same time?  It’s probably a million-to-one occurrence, but it’s still a chance; an electrical malfunction, for example, could damage both drives.  We don’t like chances, no matter how small, so we have a number of strategies to back up every single website.

Early every morning, a complete backup of each of our servers is taken and stored at a different location from the server.  We also keep separate backups of the important parts of every website at a third location, so we can go back months to restore a site if it’s found to have a fault.  These backups are copied and stored securely at a fourth location every week.
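The schedule described above can be sketched roughly as follows. This is a minimal illustration, not our actual tooling: the function names are hypothetical, and the weekly copy day (Sunday), the 90-day retention figure and the daily frequency of the third-location backup are all assumptions made for the example.

```python
from datetime import date, timedelta

WEEKLY_COPY_WEEKDAY = 6   # assumption: Sunday (Monday is 0)
RETENTION_DAYS = 90       # assumption: stands in for "months" of history


def jobs_for(day: date) -> list:
    """Return the backup jobs that would run on the given day."""
    jobs = [
        "full server backup -> second location",
        # Assumption: the third-location backup also runs daily.
        "website backup -> third location",
    ]
    if day.weekday() == WEEKLY_COPY_WEEKDAY:
        jobs.append("copy backups -> fourth location")
    return jobs


def expired(backup_day: date, today: date) -> bool:
    """True if a backup is older than the assumed retention window."""
    return today - backup_day > timedelta(days=RETENTION_DAYS)
```

For example, `jobs_for(date(2016, 2, 7))` falls on a Sunday, so under these assumptions it would include the weekly copy to the fourth location as well as the two daily jobs.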

What happens when it goes wrong?

On Saturday morning at around 10am, our disaster recovery strategy was called into play when both drives on a server failed simultaneously.  It was that million-to-one chance, and thankfully we were prepared to deal with it.

Minutes after the fault occurred, our technical support engineers were investigating the problem and its cause, and they quickly identified a solution.  They consulted with our management, and the recovery plan was put into action.

The recovery took a little longer than expected; however, all websites on that server were functioning again by early evening.  Our technicians have learnt many important lessons and should be able to perform a restore more quickly should it ever happen again.

We apologise to any customers whose website was offline during this fault, but we are also incredibly pleased that our careful planning and meticulous backup strategy worked so effectively.