Alex knows everything about your data platform. When Alex leaves, you’re in trouble.
You’d never deploy to production with no failover, no backup, no disaster recovery plan. But right now your data platform runs on one person’s memory. Same risk class. Zero mitigation.
Every data team has an Alex. The one who built the first pipeline. Who knows why that query uses a weird join. Who gets the 11pm call when the dashboard breaks. Alex is your single point of failure - and the blast radius when they’re unavailable is identical to losing a critical system.
I’ve seen this burn teams around me. One Alex took parental leave. A 20-minute fix turned into two weeks of reverse-engineering - broken data flowing to dashboards, missed SLAs, and a very uncomfortable conversation with the CFO about the board report.
Here’s the diagnostic: if the same name shows up in every incident retro and sits in every escalation path - that’s your Alex. And that’s your risk.
I used to think this was a documentation problem. It’s not. Documentation helps, but the real gap is architectural. Apply the same redundancy thinking you’d use for infrastructure: rotate ownership of critical systems quarterly, pair by default on anything that pages, and make code explain itself (dbt docs, inline comments) so the context lives in the repo instead of in someone’s head.
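For the dbt case, here’s a minimal sketch of what “code that explains itself” can look like - a schema.yml entry that records the *why* next to the model. The model name, column, and incident details below are hypothetical, purely for illustration:

```yaml
# models/marts/schema.yml - hypothetical example.
# The point: capture the reasoning (the weird join, the edge case)
# where the next engineer will actually find it.
models:
  - name: fct_board_revenue
    description: >
      Feeds the CFO board report. Uses a LEFT JOIN to invoices (not INNER)
      because a small share of orders legitimately have no invoice yet;
      dropping them would understate revenue. See the relevant incident
      retro for context.
    columns:
      - name: revenue_usd
        description: Net of refunds; refunds can land up to 30 days late.
```

Two sentences of description like this is often the difference between a 20-minute fix and two weeks of reverse-engineering.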
Who’s the Alex on your team? What would happen if they left tomorrow?
