Guide
How to manage 100+ client domains without spreadsheets
Spreadsheets work fine for five client domains. At fifty, they become a liability. Here's why the spreadsheet approach to domain monitoring breaks down, and what to replace it with.
Every agency starts the same way. One spreadsheet, one tab per client, a few columns for domain, SPF status, DKIM status, last checked date. Someone on the team checks each domain manually every month or so, updates the cells, and moves on.
This works fine for five clients. It works, with some grumbling, for fifteen. Somewhere between thirty and fifty client domains, it stops working entirely, not because anyone made a mistake, but because the approach was never designed to scale, and the failure mode is invisible until it isn't.
Why the spreadsheet approach feels fine right up until it doesn't
A spreadsheet of domain statuses has a specific property: it's only as accurate as the last time someone updated it. At five clients, "the last time someone updated it" is probably last week, because checking five domains takes twenty minutes.
At fifty clients, checking every domain manually takes hours, and those hours compete with every other task on a team's plate. So the checks get less frequent. Monthly becomes "whenever someone has time," which becomes quarterly, which becomes "we'll check if a client mentions a problem."
The spreadsheet still exists. It still has columns and dates. But the dates are old, and nobody's actively aware of how old, because the spreadsheet doesn't tell you what it doesn't know. A cell that says "SPF: Pass, checked Oct 3" looks identical whether it's October 10th or February.
The three failure modes that show up at scale
Staleness becomes invisible. A status that was true three months ago is presented with the same confidence as a status that was true this morning. There's no visual or systemic distinction between "verified recently" and "verified a while ago and probably still fine, probably."
Nobody owns the gaps. With one person checking five clients, that person knows all five intimately. With a team checking fifty clients across multiple spreadsheets or tabs, ownership gets diffuse. Everyone assumes someone else has eyes on a given domain, especially after team changes: when an account manager leaves, their client list's monitoring cadence often leaves with them, informally, even if the spreadsheet itself stays.
Detection time scales linearly with client count, but team time doesn't. If checking one domain takes five minutes, fifty domains take over four hours every cycle. Teams don't get four extra hours a week as they add clients. So either the check frequency drops, or the check thoroughness drops (a quick glance instead of a full review), or both.
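The arithmetic behind that last point can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and the five-minutes-per-domain figure is the assumption used in the paragraph above, not a measured value.

```python
def hours_per_cycle(domain_count: int, minutes_per_domain: float = 5.0) -> float:
    """Return how many hours one full manual pass over every domain takes."""
    return domain_count * minutes_per_domain / 60


# Print the cost of a single manual checking cycle at a few client counts.
for count in (5, 15, 50, 100):
    print(f"{count:>3} domains: {hours_per_cycle(count):.2f} hours per cycle")
```

Under the same assumption, a pass over 100 domains takes more than eight hours, which is a full working day per cycle before anyone has responded to a single problem.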
What "100+ domains" actually requires
The jump from "a few dozen domains, manually checked" to "100+ domains, reliably monitored" isn't a matter of working harder or hiring someone dedicated to checking spreadsheets. It requires a different model entirely: automated, continuous checks that run on their own schedule, with humans involved only when something changes.
This shifts the team's role from checking to responding. Nobody needs to verify that 97 domains are still fine; the system already knows that, because it checked this morning, and the morning before, and will check again tonight. The team's attention goes to the 3 domains where something actually changed.
What this looks like in practice
A single source of truth. Every client domain lives in one system, not scattered across team members' personal spreadsheets, onboarding docs, and tribal knowledge. When someone joins the team, there's one place to see the full client domain list and its current health, not five.
Status that reflects reality right now, not "as of last check." Instead of a cell that says "checked Oct 3," the system continuously re-verifies, so the displayed status is always current, or flagged as needing attention if a check hasn't run recently for some reason (an API outage, a DNS resolution failure).
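One way to make staleness visible is to refuse to display a status once it is older than some threshold. The sketch below assumes a 24-hour threshold and a hypothetical `CheckResult` record; neither comes from a specific product, and a real system would choose its own freshness window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: a status older than this is no longer trusted as "current".
MAX_AGE = timedelta(hours=24)


@dataclass
class CheckResult:
    domain: str
    status: str           # e.g. "pass" or "fail"
    checked_at: datetime  # timezone-aware time of the last completed check


def displayed_status(result: CheckResult, now: datetime) -> str:
    """Return the stored status only while it is fresh; otherwise flag it."""
    age = now - result.checked_at
    if age > MAX_AGE:
        return f"needs attention (last checked {age.days} days ago)"
    return result.status


now = datetime(2025, 2, 1, tzinfo=timezone.utc)
fresh = CheckResult("example.com", "pass", now - timedelta(hours=3))
stale = CheckResult("example.org", "pass", datetime(2024, 10, 3, tzinfo=timezone.utc))
print(displayed_status(fresh, now))
print(displayed_status(stale, now))
```

The design choice here is that an old "pass" never renders as "pass": the two example records store the same status but display differently, which is exactly the distinction a spreadsheet cell cannot make.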
Alerts instead of audits. Rather than scheduling time to go check fifty domains, the team gets notified when one of those fifty domains changes state: a DKIM record disappears, a domain gets blacklisted, a DMARC policy gets overwritten. The default state is "nothing to do," and that default is trustworthy because it's continuously re-verified, not assumed.
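Alerting on state changes amounts to comparing the latest snapshot of check results with the previous one and reporting only the differences. A minimal sketch, assuming each snapshot is a plain dictionary keyed by domain and then by check name (the snapshot shape, status strings, and function name are all hypothetical):

```python
def state_changes(previous: dict[str, dict[str, str]],
                  current: dict[str, dict[str, str]]) -> list[str]:
    """Compare two snapshots and describe every check whose status changed."""
    alerts = []
    for domain, checks in sorted(current.items()):
        before = previous.get(domain, {})
        for check, status in sorted(checks.items()):
            old = before.get(check)
            # A check with no earlier value is new, not changed, so it is skipped.
            if old is not None and old != status:
                alerts.append(f"{domain}: {check} changed from {old} to {status}")
    return alerts


yesterday = {
    "example.com": {"dkim": "pass", "dmarc": "reject"},
    "example.org": {"dkim": "pass", "dmarc": "quarantine"},
}
today = {
    "example.com": {"dkim": "missing", "dmarc": "reject"},
    "example.org": {"dkim": "pass", "dmarc": "none"},
}

for alert in state_changes(yesterday, today):
    print(alert)
```

When nothing has changed the function returns an empty list, which is the "nothing to do" default described above.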
Health scores that aggregate, not just list. At fifty-plus domains, even a clean list of statuses is a lot to scan. A health score per domain, with an at-a-glance view of which domains are trending down, turns "review fifty rows" into "look at the three domains with declining scores."
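As a rough illustration of what such a score could look like, the sketch below adds up weights for each passing check and then lists the domains whose score dropped since the previous run. The weights, check names, and function names are assumptions made up for this example; a real scoring model would be tuned to what actually affects deliverability.

```python
# Hypothetical weights that sum to 100; these are illustrative, not a standard.
WEIGHTS = {"spf": 25, "dkim": 25, "dmarc": 30, "not_blacklisted": 20}


def health_score(checks: dict[str, bool]) -> int:
    """Sum the weights of passing checks; 100 means every check passed."""
    return sum(weight for name, weight in WEIGHTS.items() if checks.get(name, False))


def declining(previous: dict[str, int], current: dict[str, int]) -> list[str]:
    """Return the domains whose score is lower than it was on the previous run."""
    return sorted(
        domain for domain, score in current.items()
        if score < previous.get(domain, score)
    )


last_week = {"a.example": 100, "b.example": 70, "c.example": 45}
this_week = {"a.example": 100, "b.example": 45, "c.example": 45}
print(declining(last_week, this_week))
```

Reviewing the output of `declining` replaces scanning every row: only domains that got worse show up, regardless of how many are being monitored.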
History, not just current state. When a client asks "has this been a problem before," the answer shouldn't depend on someone's memory or a buried Slack message from four months ago. A system that tracks domain health over time can answer that question directly, which matters both for troubleshooting and for demonstrating value during renewal conversations.
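Answering "has this been a problem before" only requires that check results are kept rather than overwritten. A minimal sketch, assuming an in-memory list of hypothetical `Event` records (a real system would store these in a database):

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    day: date
    domain: str
    check: str   # e.g. "dkim"
    status: str  # e.g. "pass" or "fail"


def past_failures(history: list[Event], domain: str, check: str) -> list[date]:
    """Return the dates on which the given check failed for the given domain."""
    return sorted(
        event.day for event in history
        if event.domain == domain and event.check == check and event.status == "fail"
    )


history = [
    Event(date(2024, 6, 12), "example.com", "dkim", "fail"),
    Event(date(2024, 6, 13), "example.com", "dkim", "pass"),
    Event(date(2024, 10, 3), "example.com", "dkim", "fail"),
    Event(date(2024, 10, 3), "example.org", "dkim", "pass"),
]
print(past_failures(history, "example.com", "dkim"))
```

Because every result is appended instead of replacing the previous one, the same data serves both troubleshooting and renewal conversations.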
The transition point
There's no exact client count where a spreadsheet "stops working"; it's a gradual erosion of accuracy that's hard to notice from the inside, because the spreadsheet still looks the same whether it's accurate or six months stale.
The signal to watch for isn't client count directly. It's this question: if a client's domain had a DKIM failure right now, how long would it take the agency to find out, and would they find out from the spreadsheet or from the client?
If the honest answer involves "whenever we next get around to checking" or "probably when the client notices," the spreadsheet has already stopped working, even if it still has all the right columns.
The fix isn't a bigger spreadsheet, or a more disciplined checking schedule, or a dedicated hire to do manual checks faster. It's removing the manual check from the critical path entirely, so the question "is everything okay?" has an answer that's continuously true, not periodically verified.