When "It’ll Never Happen" Happens


About eight years ago, when I was still a QA engineer, Microsoft Azure “lost” our primary database. Without it, we were basically out of business - it was the main source of truth for, well, almost everything. I no longer remember exactly what the database held, but I do remember the chaos that day. And the stress. A lot of it.

Today, I saw a tweet about how the South Korean government had all its data in a single location, with no backups.

It reminded me: we all know this lesson, but we keep relearning it the hard way.

I’m sure you’ve lost a phone before and suddenly realized half your photos were gone because you didn’t have a backup. I have too. Since then, I’ve synced everything to Google - yes, there’s a privacy tradeoff, but that’s not the point here.

The point is this: if you’re building a service someone depends on, you want backups. Period.

And you also want to protect your data from accidental deletion. I learned that the hard way too - I once deleted my Amplify sandbox for oneiras.com, not realizing it would also wipe the DynamoDB table storing my dreams. When I recreated the sandbox and the login screen told me “account doesn’t exist,” that was my “oh no” moment.

So, let’s talk about how not to repeat that mistake.


1. Turn On Deletion Protection

Start simple. Turn on deletion protection (some other AWS services call the same idea “termination protection”).

If you use the AWS console, go to DynamoDB → Tables, select your table, and choose Actions → Turn On Deletion Protection.

If you manage infrastructure as code, just Google “DynamoDB deletion protection” along with your tool name (Terraform, CDK, etc.). Then actually do it.
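If you'd rather script it than click through the console, here is a minimal sketch using boto3. It assumes boto3 is installed and your AWS credentials are already configured; the function names, table name, and region are placeholders I made up for illustration. The only real API involved is DynamoDB's UpdateTable call with its DeletionProtectionEnabled parameter.

```python
def deletion_protection_request(table_name: str) -> dict:
    """Build the keyword arguments for DynamoDB's UpdateTable call."""
    return {"TableName": table_name, "DeletionProtectionEnabled": True}


def enable_deletion_protection(table_name: str, region: str) -> None:
    """Send the request. Needs boto3 and valid AWS credentials."""
    # Imported here so the request builder above works without boto3 installed.
    import boto3

    client = boto3.client("dynamodb", region_name=region)
    client.update_table(**deletion_protection_request(table_name))
```

Calling `enable_deletion_protection("my-table", "us-west-2")` (both values hypothetical) should flip the same switch as the console action. Keeping the request in its own function makes it easy to print and review before anything touches a real table.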

The best part? It costs nothing. But it could save you a lot of gray hair later.


2. Enable Point-in-Time Recovery (PITR)

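Even with deletion protection, things can go wrong. Maybe you accidentally delete the wrong table (it happens), or your protection gets overridden. Deletion protection also does nothing for the items inside the table: a buggy script can still overwrite or delete them.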

That’s where Point-in-Time Recovery (PITR) comes in. It automatically backs up your data, so you can restore it to any second in the past 35 days.

To enable it, go to DynamoDB → Tables → Actions → Update Settings → Backups, then toggle on Point-in-Time Recovery.
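The same switch can be flipped from a script. This is a sketch under the same assumptions as any boto3 snippet (boto3 installed, credentials configured); the function names are mine, and the table name and region you pass in are up to you. The API call itself is DynamoDB's UpdateContinuousBackups.

```python
def pitr_request(table_name: str) -> dict:
    """Build the keyword arguments for DynamoDB's UpdateContinuousBackups call."""
    return {
        "TableName": table_name,
        "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
    }


def enable_pitr(table_name: str, region: str) -> None:
    """Send the request. Needs boto3 and valid AWS credentials."""
    # Imported here so the request builder above works without boto3 installed.
    import boto3

    client = boto3.client("dynamodb", region_name=region)
    client.update_continuous_backups(**pitr_request(table_name))
```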

It costs around $0.20 per GB-month according to AWS pricing, and the exact rate depends on the region. The bigger your table and indexes, the more you’ll pay - but that cost is nothing compared to losing production data.
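To put a number on it, here is a back-of-the-envelope estimate. The rate is the roughly $0.20 figure quoted above, hard-coded as an assumption; check the current pricing for your region before trusting the result. The function name and the example sizes are made up.

```python
# Assumed rate, taken from the figure quoted above. It varies by region.
PITR_USD_PER_GB_MONTH = 0.20


def estimate_pitr_monthly_cost(table_gb: float, index_gb: float = 0.0) -> float:
    """Rough monthly PITR cost in USD for a table plus its indexes."""
    if table_gb < 0 or index_gb < 0:
        raise ValueError("sizes must be non-negative")
    return round((table_gb + index_gb) * PITR_USD_PER_GB_MONTH, 2)


# A hypothetical 50 GB table with 10 GB of indexes: 60 GB * $0.20 = $12.00 a month.
print(estimate_pitr_monthly_cost(50, 10))
```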

Backups aren’t glamorous, but neither is explaining to your boss why you lost everything.
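Enabling PITR is only half of it; at some point you may have to use it. A PITR restore does not roll the existing table back. It creates a new table from the chosen moment. Below is a hedged sketch of building that request for boto3's `restore_table_to_point_in_time`. The function names and table names are placeholders, and the 35-day check is only a rough local guard: the real bounds are whatever earliest and latest restorable times DynamoDB reports for your table.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

PITR_WINDOW = timedelta(days=35)  # retention window mentioned above


def restore_request(source_table: str, target_table: str,
                    restore_to: datetime,
                    now: Optional[datetime] = None) -> dict:
    """Build kwargs for RestoreTableToPointInTime.

    restore_to and now must be timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    if source_table == target_table:
        raise ValueError("the restore must go into a new table name")
    if restore_to > now:
        raise ValueError("restore time is in the future")
    if now - restore_to > PITR_WINDOW:
        raise ValueError("restore time is outside the 35-day window")
    return {
        "SourceTableName": source_table,
        "TargetTableName": target_table,
        "RestoreDateTime": restore_to,
    }


def restore_table(source_table: str, target_table: str,
                  restore_to: datetime, region: str) -> None:
    """Send the request. Needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("dynamodb", region_name=region)
    client.restore_table_to_point_in_time(
        **restore_request(source_table, target_table, restore_to)
    )
```

Once the restored table looks right, you still have to point your application at it (or copy the data back), so it is worth rehearsing this once before you need it for real.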

3. Create a Cross-Region Replica

Backups are great, but restores take time. And sometimes, entire regions go down. (Yes, it happens. I’ve seen it.)

Each AWS region has multiple availability zones - think of them as data centers within driving distance - and DynamoDB automatically replicates across those behind the scenes. That’s usually enough.

But “usually” isn’t “always.” I’ve seen regions go dark due to bad code pushes, DNS issues, or worse. That’s why you should create a replica in another region.

For my team, we operate mainly in us-west-2 (PDX) and replicate to us-east-1 (IAD).

To set that up: DynamoDB → Tables → Actions → Create Replica, choose your region and consistency, and hit Create.
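For the scripted version, a replica can be requested through UpdateTable's ReplicaUpdates parameter. Treat this as a sketch with assumptions: it targets the current version of global tables, the function names are mine, and when you go through the API rather than the console you may first need to enable DynamoDB Streams (new and old images) on the table. The regions in the usage line are the ones my team uses; swap in your own.

```python
def create_replica_request(table_name: str, home_region: str,
                           replica_region: str) -> dict:
    """Build kwargs for an UpdateTable call that adds one replica region."""
    if replica_region == home_region:
        raise ValueError("the replica must live in a different region")
    return {
        "TableName": table_name,
        "ReplicaUpdates": [{"Create": {"RegionName": replica_region}}],
    }


def create_replica(table_name: str, home_region: str, replica_region: str) -> None:
    """Send the request from the table's home region. Needs boto3 and credentials."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("dynamodb", region_name=home_region)
    client.update_table(
        **create_replica_request(table_name, home_region, replica_region)
    )
```

For example, `create_replica("my-table", "us-west-2", "us-east-1")` would mirror the PDX to IAD setup described above, with "my-table" standing in for a real table name.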

It’s not free, but it’s worth it. The DynamoDB pricing page has the details if you want them.

Replication is your Plan B for when “that’ll never happen” inevitably does.

4. Learn from the Pros (Even When You Don’t Agree)

Before any new AWS service launches globally, it has to pass an Operational Readiness Review (ORR) - a checklist of 100+ questions covering alarms, throttling, backups, security, and more.

I don’t agree with everything on that list, but the backup questions? Absolutely essential.

These steps - deletion protection, PITR, cross-region replication - are all on that checklist. And for good reason.

They’re easy to set up, cheap to maintain, and might just save your job someday.
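A checklist is easier to keep honest when a script can run it. The sketch below is my own illustration, not the ORR itself: it reads the responses of DescribeTable and DescribeContinuousBackups and lists which of the three protections are missing. The function names are hypothetical, and the response fields it reads (`DeletionProtectionEnabled`, `PointInTimeRecoveryStatus`, `Replicas`) are the ones I expect from the current DynamoDB API.

```python
def missing_protections(table_desc: dict, backups_desc: dict) -> list:
    """Return the protections a table lacks, given the two describe responses."""
    table = table_desc.get("Table", {})
    pitr_status = (
        backups_desc.get("ContinuousBackupsDescription", {})
        .get("PointInTimeRecoveryDescription", {})
        .get("PointInTimeRecoveryStatus")
    )
    missing = []
    if not table.get("DeletionProtectionEnabled", False):
        missing.append("deletion protection")
    if pitr_status != "ENABLED":
        missing.append("point-in-time recovery")
    if not table.get("Replicas"):
        missing.append("cross-region replica")
    return missing


def audit_table(table_name: str, region: str) -> list:
    """Fetch both descriptions and audit them. Needs boto3 and credentials."""
    import boto3  # imported lazily so the check above works without boto3

    client = boto3.client("dynamodb", region_name=region)
    return missing_protections(
        client.describe_table(TableName=table_name),
        client.describe_continuous_backups(TableName=table_name),
    )
```

An empty list means all three are in place. Anything else is your to-do list.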


Final Thoughts

If you’re running anything that stores user data, take an hour today to protect it.

Turn on deletion protection. Enable PITR. Add a replica.

Because if you don’t have a backup, it’s not if you’ll need one - it’s when.

(And yes, I still use Google Photos to back up my pictures, so at least my travel photos are safe.)

Have you ever lost data because you didn’t have a backup? I’d love to hear your story - misery loves company.

Cheers!

Evgeny Urubkov (@codevev)

