Disaster Recovery
I thought now might be a good time for us to publish what we do in terms of backing up your data and recovering from hardware failures.
This stuff long ago got more technical than I have the patience (read: competence) for, so I’ll hand you over to Tim McOwan, our CTO.
One of the key benefits of using KashFlow is that we take care of things for you, so that you don’t have to lose sleep worrying about things like “How secure is my data?” and “What happens if disaster strikes and one of the servers fails?”
We’re proud to say we’re hosted with Rackspace, the leading provider of managed hosting solutions. Yes, they’re expensive – but we think they’re worth every penny. They guarantee 100% network uptime and can replace any faulty component in our servers within an hour. We’ve been with them for two years now and have just signed up for another two.
Hardware Failure
Let’s say a component on one of our servers goes bang. Under our contract with Rackspace, and because they’re very good at this kind of thing, they’ll have it replaced, whatever it is, in less than 1 hour.
What if it’s a hard disk that goes bang? Where is the valuable data kept, and how up to date is it? All of our servers use what is called “RAID”, which stores the data redundantly across several disks. So if one hard disk goes pop, you wouldn’t even notice. Rackspace would put in a new disk within the hour and KashFlow wouldn’t even sneeze.
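To give a feel for how RAID shrugs off a single dead disk, here is a minimal sketch. It assumes a parity-based layout; the RAID level actually in use isn't stated here, and the disk contents and the `parity` helper are made up purely for illustration.

```python
from functools import reduce

def parity(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks plus one parity disk.
data_disks = [b"INVOICE1", b"INVOICE2", b"RECEIPT9"]
parity_disk = parity(data_disks)

# Disk 1 "goes pop": XOR the surviving disks with the parity disk
# to rebuild exactly what was on the failed one.
survivors = [data_disks[0], data_disks[2], parity_disk]
rebuilt = parity(survivors)

print(rebuilt == data_disks[1])  # True
```

A mirrored layout would be simpler still (every disk has an identical twin), but the idea is the same: losing any one disk leaves enough information on the others to carry on and to rebuild the replacement.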
Backups
What if a Bad Thing happens? Perhaps all the disks fail at once or the data centre explodes? There’s only one thing more agonising than doing your books, and that’s having to do them twice. So we have lots of backups of your data:
– We take (and verify) an offsite backup of the database every night as a matter of routine.
– We also take a backup of the database every 15 minutes and immediately send that data off-site to a separate data centre at the other end of the country for secure storage.
– Additionally, we have an expensive bit of software called DoubleTake. This software replicates, in real time, all of the data on our live servers to another server on the other side of the country. So if the worst ever happens, there would be absolutely zero data lost.
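The point of having three layers is that each one limits how much recent work could go missing if the layers above it were unavailable. A rough sketch of that arithmetic follows; the layer names and the `worst_case_loss` helper are hypothetical, and treating real-time replication as exactly zero loss is an idealisation.

```python
from datetime import timedelta

# Hypothetical labels for the three layers listed above. Each interval is
# the longest gap between one copy of the data and the next.
LAYERS = {
    "nightly": timedelta(hours=24),
    "every_15_minutes": timedelta(minutes=15),
    "real_time_replication": timedelta(0),  # idealised: no gap at all
}

def worst_case_loss(surviving_layers):
    """Most recent work that could be lost if only these layers are usable."""
    if not surviving_layers:
        raise ValueError("no usable backup layer")
    return min(LAYERS[name] for name in surviving_layers)

# If replication were somehow unavailable, the 15-minute backups cap the loss.
print(worst_case_loss(["nightly", "every_15_minutes"]))  # 0:15:00
# With all three layers intact, nothing is lost.
print(worst_case_loss(LAYERS))  # 0:00:00
```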
Getting back up and running
If the server or data centre has exploded – it’s great to know your data is safe. But what about accessing it and carrying on working as normal?
We have additional hardware on stand-by with our software installed on it. It can be brought online very quickly, using the data copied to it by DoubleTake.
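The only problem is with DNS: once we point our site’s address at the new server, it can take up to 24 hours for your ISP’s cached copy of the old address to expire, and until then your requests would still be sent to the old server.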
As a workaround, if we ever had to switch over to our backup servers, we would publish a new address so you could access them straight away. This address would be published on our Twitter feed and on this blog. We’d also send an email to all of our users just to be sure.
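The DNS delay comes down to caching: a resolver that looked up our address just before a switch keeps using the old answer until its cached copy expires. Here is a small sketch of that timing, assuming a 24-hour time-to-live to match the figure above. The dates and the `sees_new_server_at` helper are made up for illustration.

```python
from datetime import datetime, timedelta

def sees_new_server_at(cached_at, ttl, switched_at):
    """When a resolver that cached the old address first picks up the new one."""
    expires = cached_at + ttl
    # If the cached entry runs out before the switch, the very first
    # lookup after the switch already returns the new address.
    return max(expires, switched_at)

ttl = timedelta(hours=24)                   # assumed time-to-live
switched_at = datetime(2010, 1, 1, 12, 0)   # hypothetical switch-over time

# An unlucky ISP cached the old address one minute before the switch.
unlucky = sees_new_server_at(switched_at - timedelta(minutes=1), ttl, switched_at)
print(unlucky - switched_at)  # 23:59:00

# A lucky ISP's cached entry had already expired by the time of the switch.
lucky = sees_new_server_at(switched_at - timedelta(hours=25), ttl, switched_at)
print(lucky - switched_at)  # 0:00:00
```

A freshly published address has never been cached by anyone, which is why the workaround gets you in immediately.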
Testing
The best plans in the world can look great on paper but fail miserably when they’re implemented. So, as you would expect, we regularly test our recovery procedures to make sure they actually work and do what we expect them to.
Hopefully this demonstrates to you how seriously we take this stuff, but if you have any questions at all then please email [email protected] and we’ll do our best to answer.
For more of Tim’s techie stuff, see his personal blog at devballs.com