Over the past couple of weeks I’ve had the pleasure of getting to know the founder of Agile Planner, Graham Ashton. Graham has built a fail-proof database backup routine to protect his customers’ data. Like most sysadmins, Graham takes the security of customer data very seriously, and he’s using Dead Man’s Snitch to keep tabs on his automated backups. In fact, backups are one of the top snitched-on tasks by DMS users.
Here’s Graham’s story:
Agile Planner is a planning tool for teams using iterative development, such as XP or Scrum. Agile Planner helps users turn a stack of estimated cards into a plan they can deliver on. I’m using Dead Man’s Snitch to guarantee that customer data is securely backed up off site, multiple times each day.
The site is hosted on Heroku, which runs on Amazon’s cloud services. While we benefit from the Heroku Postgres team’s expertise to keep our databases running smoothly, there’s a small risk that Agile Planner would be affected by downtime at Amazon. I wanted to ensure my customers would still be able to access their data even if Amazon went down.
My solution was to copy a backup of the Agile Planner database to a dedicated VPS, several times a day. Storing backups on multiple service providers reduces Agile Planner’s dependence on third parties, whose downtime is outside of my control.
Ensuring that the backup job runs successfully multiple times per day is where Dead Man’s Snitch comes in. DMS alerts me promptly if the backup script fails to run successfully. I’m confident that I can quickly move the site to an alternative hosting company should the unthinkable happen over at Heroku.
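The backup script itself isn’t shown here. As a rough sketch of the pattern, the job checks in with Dead Man’s Snitch only after the backup command exits successfully, so a failed or skipped run shows up as a missed check-in. The command path and the snitch URL below are hypothetical placeholders, not details from Agile Planner’s setup. Substitute the check-in URL shown for your own snitch.

```python
import subprocess
import urllib.request

# Hypothetical placeholders: neither value comes from the article.
BACKUP_COMMAND = ["/usr/local/bin/backup-database.sh"]
SNITCH_URL = "https://nosnch.in/your-snitch-token"


def check_in(url, timeout=10):
    """Send the heartbeat with a plain HTTP GET and return the status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def run_backup(command=BACKUP_COMMAND, snitch_url=SNITCH_URL,
               run=subprocess.run, ping=check_in):
    """Run the backup and check in only if it exited successfully."""
    result = run(command)
    if result.returncode != 0:
        # Deliberately skip the check-in: the missing heartbeat
        # is what triggers the alert.
        return False
    ping(snitch_url)
    return True
```

Calling `run_backup()` from the script that cron invokes is enough. The `run` and `ping` parameters exist only so the logic can be exercised without a real backup or network call.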
If you implement something similar on your site, don’t forget to regularly test your restoration procedure so you’ll know your backups will actually work when you need them!
To learn more about Agile Planner, visit www.agileplannerapp.com.
If you’re familiar with crontab in Linux, there’s a good chance you’re equally familiar with the infamous silent cron job failure. Many of us sysadmins and developers have been hit by these failures and not found out until it was too late. Automated backups and monthly emails aren’t always as automated (or as punctual) as we tend to think. Herein lies the problem.
My cron jobs send me an email when they run…isn’t that enough?
That can be true, but is there REALLY any value in knowing that they ran? Isn’t that why you created the cron job in the first place, so that it does its job? Sure, receiving an email of cron output after each run is great. However, the real value lies in knowing when your cron jobs FAIL to run (or are delayed). Then you can investigate and fix the problem before it’s too late. Not convinced? Consider this example from a longtime DMS user, Kareem Mayan, co-founder of SocialWOD.com.
A little background:
“At SocialWOD.com we do workout tracking for gyms. When a new workout is emailed to us from a customer (in the form of a photo of a whiteboard, which has the workout and results), we put the data online. Once it’s online, we email that gym’s clients telling them new workout results have been posted.”
Great. Where’s the problem?
“When Delayed Job (DJ) failed silently, we wouldn’t know until me or my co-founder was prompted to look based on seeing something funny, e.g. seeing a Stripe email about a new customer signup but NOT seeing the automated welcome email to the new customer (sent by our system… which was waiting in the database, ready for DJ to pick it up, which would never happen because that process had died).
“The result would often be several days of emails (THOUSANDS of emails) queued up until one of us manually restarted DJ. This sucked because customers would either get a ton of emails at once, and some would be days old, or we would delete those emails before they got sent. This also sucked because customers would never get notifications about their posted results.”
How did Dead Man’s Snitch help?
“Using Dead Man’s Snitch made that problem go away. Now, if DJ dies, DMS never gets pinged, and we get an email as soon as that happens. At most we’ll go five minutes - not days - before knowing that we need to kick DJ into action again.”
If you can relate and want the peace of mind of knowing right away when your cron jobs fail, give Dead Man’s Snitch a try and sign up for free. After all, your first snitch is on us. And if you can’t relate but made it to the end anyway, I applaud you. There’s a good chance your computer friends, IT department, or website managers can relate, so do them a favor and pass this on. Happy Snitching!