Why Your Agency Can't Tell You What's Working

The short version

Most marketing agencies ship ten things in a month and report on the total. When traffic moves, they take credit for the bundle. When traffic doesn't move, they blame the algorithm. Either way, you paid for the work and didn't get the one thing it should have produced: actual knowledge about what moves your business.

The discipline that prevents this has a name. The CRO and A/B testing world has practiced it for years. The SEO, GEO, and AEO world hasn't always picked up the same habit, and that's the gap this post is about.

This is the third piece in a short series on marketing problems that quietly cost businesses money. The first was on schema cannibalization. The second was on entity confusion in AI search. This one ties them together.

The quarterly review scene

I've sat in a lot of agency quarterly reviews over 25 years. The same scene plays out almost every time.

The deck has 30 slides of activity. Traffic is up 18%. Conversions are up 12%. The CMO or CEO asks the obvious question: "Great. Which of the things you did caused this?"

The answer is some version of: "Well, it's really the cumulative effect of everything working together." Or: "SEO is holistic, so it's hard to isolate." Or the most honest of the wrong answers: "We just did a lot of things and the numbers moved."

What you just heard is not a strategy. It's an admission. The agency can't tell you which of their efforts earned the result, because they shipped too many changes at once to ever know. And next quarter, when something stops working, they won't be able to tell you which thing to fix either.

There's a name for the discipline that prevents this. It's borrowed from how scientists figure things out.

What variable isolation actually means

Imagine you're trying to figure out why your sourdough won't rise. You decide to bake a new loaf. You change the flour brand, increase the hydration, swap your starter, raise the proofing temperature, and try a different oven rack. The new loaf comes out perfect.

Congratulations. You have no idea what fixed it.

Maybe it was the flour. Maybe it was the temperature. Maybe four of your changes did nothing and one fixed everything. Maybe two of them canceled each other out and a third was the hero. You'll never know. And the next time your bread fails, you have to change five variables again, because you don't know which one matters.

That's exactly how most marketing agencies operate. They batch a dozen changes into a single deploy because shipping feels like progress. When the metrics move, they take credit for the whole bundle. When they don't, they blame the algorithm. Either way, you don't learn anything you can use next month.

Variable isolation is the discipline of changing one thing (or one carefully defined set of related things) at a time, so you can actually attribute the result to a cause. It takes longer. It earns real knowledge. And it compounds across every decision after it.

A note to my CRO and A/B testing friends, who've been doing this forever

If you've ever sat in a room with someone who insisted on statistical significance before they'd call a winner, or pushed back on launching three changes at once, you've already met variable isolation. The conversion rate optimization world has practiced this discipline for years. It's the bedrock of how A/B testing works at all.

Honestly, I learned the discipline from CRO people, not from SEO people.

Back in my days running paid media for Verizon Business, we worked with the conversion optimization team at Performics. That team was wicked smart. I loved every meeting I got to sit in with them. They taught me two things I've held onto ever since.

One: be patient.

Two: don't do anything until you have statistical significance.

For an impatient person (and I am very much one), both lessons took me a while to absorb. They're still the two pieces of advice I pass along most often, to clients and to other marketers.

The reason this post exists is that the SEO, GEO, and AEO side of the marketing world hasn't picked up the same habits. Most technical SEO and AI visibility work ships without anyone asking "how will we know if this worked?" That gap is what's costing clients real money every month, and almost nobody on the SEO side is naming it.

A real example, from a few weeks ago

Two weeks ago I had two fixes ready to ship on a client's site. Both addressed the same underlying problem: the contact page was outranking the homepage for branded searches and converting at less than 1%. The homepage was converting branded searches at almost 10%. (Full story in the schema cannibalization post.)

The first fix was a canonical tag, a one-line technical instruction telling Google "the homepage is the official version." Standard, easy, well-understood.

The second fix was a complete rebuild of the structured data behind both pages, so they would stop introducing themselves to Google as the same business. Deeper work, more nuanced, harder to do well.

Both fixes were tested, validated, and ready to deploy. Most agencies would have shipped them the same day. I held the canonical and shipped only the schema fix.

Here's why.

Both fixes target the same outcome. If I'd shipped them at the same time and the metrics moved, I'd never know whether the canonical did the work, the schema did the work, or whether they reinforced each other. Worse, I'd never know whether one of them was actually unnecessary. If the schema fix alone resolves the problem, the canonical can come off in a future cycle and the homepage will still get the credit. That matters, because canonicals carry small SEO costs of their own, and unnecessary ones are technical debt I don't want my clients carrying.

So the plan: ship the schema fix, wait four to twelve weeks for results to stabilize, then evaluate. If the schema fix did the job alone, pull the canonical. If results are partial, leave the canonical in place and we now know we need both. Either way, the client gets actual knowledge, not a vague "things are better."
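The wait-then-evaluate part of that plan can be tracked with very little tooling. Here is a minimal sketch in Python, assuming a hypothetical change log: the `Change` class, its field names, the metric labels, and the ship dates are all invented for illustration and are not the client's actual data. It derives the four-to-twelve-week evaluation window from a ship date and flags any two changes that target the same metric and shipped less than four weeks apart, which is the situation where attribution gets lost.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from itertools import combinations

# The four-to-twelve-week window from the plan above.
MIN_WAIT = timedelta(weeks=4)
MAX_WAIT = timedelta(weeks=12)


@dataclass(frozen=True)
class Change:
    """One shipped change (hypothetical structure, for illustration only)."""
    name: str
    target_metric: str
    shipped_on: date

    @property
    def evaluation_window(self) -> tuple[date, date]:
        """Earliest and latest dates to evaluate this change."""
        return (self.shipped_on + MIN_WAIT, self.shipped_on + MAX_WAIT)


def confounded_pairs(changes: list[Change]) -> list[tuple[str, str]]:
    """Pairs of changes that hit the same metric less than MIN_WAIT apart."""
    pairs = []
    for a, b in combinations(changes, 2):
        same_metric = a.target_metric == b.target_metric
        too_close = abs(a.shipped_on - b.shipped_on) < MIN_WAIT
        if same_metric and too_close:
            pairs.append((a.name, b.name))
    return pairs


# Illustrative log: what a batched deploy would look like (dates are made up).
batched_log = [
    Change("schema rebuild", "branded conversions", date(2024, 3, 4)),
    Change("canonical tag", "branded conversions", date(2024, 3, 4)),
    Change("subject line test", "email open rate", date(2024, 3, 6)),
]

schema_window = batched_log[0].evaluation_window
flagged = confounded_pairs(batched_log)
print(schema_window)
print(flagged)
```

In this made-up log, the two changes aimed at branded conversions are flagged because they shipped the same day, while the email change is left alone because it targets a different metric. A spreadsheet with the same three columns would do the same job.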

That's a small example. The principle scales to every marketing decision worth making.

This isn't just an SEO problem

Variable isolation isn't a technical SEO concept. It's a discipline that applies everywhere you're trying to learn from your work, and it gets violated in almost every marketing function I've watched.

Where it gets violated	What teams do	What it costs
Paid media	Launch three new audiences, two new creatives, and a bid strategy change in the same week	You can't tell what's working, so the next round is guessing too
Email	Change subject line, sender name, send time, and CTA copy in one A/B test	You get a "winner" and zero understanding of why
Content and SEO	Publish a new pillar page, update internal links across 40 pages, and change the navigation in one sprint	Rankings move, you can't attribute, the playbook is non-transferable
CRO	Redesign the hero, change the form, and adjust the offer in one release	You know the new page converts better. You don't know which element earned the lift

The common pattern: every time you batch changes, you trade the ability to learn for the appearance of momentum. Some teams do this because they don't know better. Others do it because their billing model rewards activity over insight. Either way, you pay for the work and don't get the knowledge it should have produced.

Five questions for your next agency meeting

These are the five I'd bring to your next review. You don't have to be a marketer to ask them. You just have to recognize the difference between a real answer and a deflection.

1) For the changes you shipped last month, which single change had the biggest measurable impact?
A real answer cites a specific change and the metric it moved. A vague answer signals batched deploys.

2) What's the staged rollout plan for the next round of changes?
A real answer includes which changes go first, what success looks like, and when the next one will deploy. A vague answer signals everything going out at once.

3) When something stops working, how do you diagnose what changed?
A real answer describes a process. A vague answer means they'll guess.

4) What's currently shipped that you'd consider removing if the data supported it?
A real answer names something specific. A vague answer means nothing they've ever shipped is up for review.

5) Show me a recent decision where you deliberately waited before shipping something.
A real answer exists. If they can't think of one, they don't practice the discipline.

Ask these five in your next quarterly review. If three or more come back vague, you know enough to start asking harder questions. Or to start looking elsewhere.

Why most agencies skip this

Variable isolation costs the agency something. It slows the deliverable cadence. It makes the monthly report look thinner. It requires saying "we deliberately held this one back" in a status meeting, which sounds like a delay to anyone judging by velocity.

It also requires the agency to be okay with finding out that some of what they shipped didn't work. Most aren't. The whole business model of agency retainers depends on the perception of continuous, valuable work. Pausing to learn from what already shipped looks, from inside the agency, like an admission. So they keep shipping, the months stack up, and the question "which of these earned the result" stops getting asked.

The agencies worth hiring are the ones who'll deliberately go slower so the results mean something. The ones who'll tell you "we held this fix for the next cycle so we can isolate the impact of this one" instead of "we shipped twelve things this month." The ones who can show you, six months later, exactly which work earned the result and which work didn't.

Where I land

Most agencies will tell you they're "data-driven." Almost none of them actually are. Being data-driven means making decisions you can learn from, not just decisions you can defend after the fact.

This is the third post in a short series on the kinds of marketing problems most agencies miss. The first was about a hidden technical SEO problem that splits your branded traffic between competing pages. The second was about how AI systems like ChatGPT and Perplexity are quietly misrepresenting businesses to potential customers. All three share a common argument: the most valuable marketing work is the work most agencies skip, because it's invisible, slow, or unprofitable for them.

That's the work I do.

Frequently asked questions

What is variable isolation in marketing?

Variable isolation is the discipline of changing one thing at a time so you can attribute the result to a specific cause. In marketing, it means shipping one meaningful change (or one carefully defined set of related changes), waiting for results to stabilize, and only then deploying the next change. The point is to build knowledge that compounds, instead of activity that doesn't.

Why don't most marketing agencies practice variable isolation?

Two reasons. First, it slows the deliverable cadence and makes monthly reports look thinner, which feels uncomfortable when the client is judging by visible velocity. Second, the agency's billing model often rewards activity over insight, so shipping more changes feels like producing more value, even when the value can't be attributed.

How do I know if my SEO is actually working?

Ask your agency to point to one specific change they shipped recently and the metric it moved. A real answer cites the change and the metric. A vague answer ("it's holistic," "everything works together") means they shipped too many changes at once to know. The same test works for paid media, email, content, and CRO.

How long should I wait between SEO changes to see what's working?

For SEO and AI visibility changes, four to twelve weeks is usually right, depending on how much traffic the page gets and how big the change was. Lower-traffic pages need longer to produce a directional read. Higher-traffic pages can show signal faster. The instinct to ship the next change immediately is almost always wrong.

Can variable isolation work on a low-traffic site?

Yes, with some adjustment. Traditional A/B testing needs enough traffic to hit statistical significance, which most local and B2B sites don't have. The alternatives are sequential before-and-after testing (make the change, compare to the prior period), cohort comparisons (same traffic source before and after), and long-running directional reads (60 to 90 days of trend watching). The discipline is the same. The math is different.
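To see why low-traffic sites struggle with classic A/B tests, it helps to put rough numbers on "enough traffic." The sketch below uses the standard normal-approximation sample size formula for comparing two conversion rates. The function names, the 2% and 3% conversion rates, and the weekly visitor counts are assumptions chosen for illustration, not figures from any client. It only estimates statistical volume and says nothing about how long search engines take to reflect a change.

```python
import math
from statistics import NormalDist


def visitors_needed_per_variant(baseline_rate: float,
                                expected_rate: float,
                                alpha: float = 0.05,
                                power: float = 0.80) -> int:
    """Approximate visitors per variant for a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


def weeks_to_reach(n_per_variant: int, weekly_visitors: int,
                   variants: int = 2) -> int:
    """Weeks of traffic needed when visitors are split across the variants."""
    return math.ceil(n_per_variant * variants / weekly_visitors)


# Hypothetical scenario: detect a lift from 2% to 3% conversion.
n = visitors_needed_per_variant(0.02, 0.03)
weeks_small_site = weeks_to_reach(n, weekly_visitors=300)
weeks_large_site = weeks_to_reach(n, weekly_visitors=6000)
print(n, weeks_small_site, weeks_large_site)
```

Under these assumptions the test needs roughly 3,800 visitors per variant. A page with 6,000 visitors a week gets there in about two weeks, while a page with 300 a week would need about half a year, which is why the before-and-after and directional approaches above are the practical route for smaller sites.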

What questions should I ask my marketing agency in a quarterly review?

Five questions surface most of what you need to know:

1) Which single change had the biggest measurable impact last month?
2) What's the staged rollout plan for the next round of changes?
3) When something stops working, how do you diagnose what changed?
4) What's currently shipped that you'd consider removing if the data supported it?
5) Show me a recent decision where you deliberately waited before shipping something.

If the answers to three or more of those are vague, you have your answer about whether they practice the discipline.

What's the difference between variable isolation and A/B testing?

A/B testing is one method of variable isolation, where two versions of something run simultaneously so you can compare results. Variable isolation is the broader principle: change one thing at a time, in whatever way the situation allows. On a low-traffic site without the volume for clean A/B tests, variable isolation still applies. You just use sequential testing or directional reads instead.
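Whichever way the comparison is set up, the underlying check is the same: are two conversion rates different by more than chance would explain? A minimal sketch of that check is a pooled two-proportion z-test, written here with only Python's standard library. The function name and all visitor and conversion counts are hypothetical. For a before-and-after comparison the result is weaker evidence than a simultaneous A/B test, because the two periods can differ in seasonality or traffic mix, so treat it as a directional read.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int,
                          conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rate."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Same 2% vs 4% conversion rates at two hypothetical traffic levels.
z_low, p_low = two_proportion_z_test(4, 200, 8, 200)
z_high, p_high = two_proportion_z_test(400, 20000, 800, 20000)

print(f"low traffic:  z={z_low:.2f}, p={p_low:.3f}")
print(f"high traffic: z={z_high:.2f}, p={p_high:.3g}")
```

With 200 visitors per group, a jump from 2% to 4% does not clear the conventional 0.05 threshold. With 20,000 per group, the same rates clear it easily. That is the patience lesson in numeric form: the lift may be real, but a small sample cannot prove it yet.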
