When Theory Catches Up to Practice: What We Built Before the Map Existed

2026-06-11 · Dorian Cougias

Benedict Evans has been refining his analysis of the generative AI platform shift. His latest work maps where value accrues when foundation models become commodities – distribution, proprietary data, product design, user experience. The real innovation, he argues, happens not in the model itself but in what gets built around it.

We agree. But not because we read his analysis and nodded along.

We built the proof first.

Our research paper, "Conversational AI in Qualitative Research," embodies Evans's thesis point by point. Not because we set out to validate platform shift theory – but because solving a real problem led us exactly where his latest analysis now points. We'd read Evans's earlier contributions from last year. The foundation was there. But the specific framework he's articulating now? We discovered it through building, not reading.

When we finally encountered his current analysis, we recognized our own work staring back at us.

We Learned the Commodity Foundation Isn't Enough

Evans keeps hammering this: raw LLM capability doesn't create differentiation. The models are powerful, yes. But they're converging. The defensibility lives in the layers above.

We discovered this through iteration, not theory. The system we created "requires additional layers" beyond the breakthrough LLM itself. We recognized early that we needed to "wrap this in tooling and product." And so we did. Sentiment analysis. Topic modeling. Formality detection. Each layer built to gauge context, read trust levels, understand situational nuance before generating a response.

Our goal was never generating plausible text. It was generating text that fits the situation. Which means understanding the situation first.
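To make "understand the situation first" concrete, here is a minimal sketch of the layering idea, not the system described in the paper. It assumes toy keyword lists in place of real sentiment, topic, and formality models. Every name in it (analyze, build_instruction, Context, the word sets) is hypothetical and exists only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword lists standing in for trained models (illustration only).
POSITIVE = {"love", "great", "easy", "helpful"}
NEGATIVE = {"hate", "frustrating", "broken", "slow", "annoying"}
CASUAL = {"yeah", "gonna", "kinda", "stuff", "lol"}
TOPICS = {
    "pricing": {"price", "cost", "expensive", "budget"},
    "onboarding": {"setup", "install", "tutorial", "signup"},
}


@dataclass
class Context:
    sentiment: str
    formality: str
    topics: list[str]


def tokenize(text: str) -> list[str]:
    # Lowercase the text and keep runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def analyze(text: str) -> Context:
    """Run the three context layers over one participant message."""
    words = tokenize(text)
    # Sentiment layer: positive hits minus negative hits.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    sentiment = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    # Formality layer: casual markers or contractions suggest a casual register.
    casual = any(w in CASUAL or "'" in w for w in words)
    formality = "casual" if casual else "formal"
    # Topic layer: any keyword overlap tags the topic.
    topics = sorted(t for t, kws in TOPICS.items() if kws & set(words))
    return Context(sentiment, formality, topics)


def build_instruction(ctx: Context) -> str:
    """Turn the context read into guidance for the response-generation step."""
    tone = "relaxed, plain language" if ctx.formality == "casual" else "a professional register"
    focus = ", ".join(ctx.topics) or "whatever the participant raised"
    probe = (
        "ask about the last time this went wrong"
        if ctx.sentiment == "negative"
        else "ask for a specific recent example"
    )
    return f"Use {tone}. Stay on: {focus}. Next, {probe}."


ctx = analyze("Yeah, the setup was kinda frustrating and slow")
print(ctx)
print(build_instruction(ctx))
```

The point of the sketch is the ordering: the context read happens before any text is generated, and the generation step receives that read as guidance. In a real system each layer would presumably be a model rather than a word list.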

Evans now articulates this as the stack shift. We found it by running into walls and building around them.

We Started with Experience, Worked Backwards

Evans quotes Steve Jobs: start with the experience and work backwards to the technology. Don't fall in love with capabilities. Fall in love with problems worth solving.

We didn't need Jobs or Evans to tell us this. The problem demanded it. We encoded The Mom Test – a specific, rigorous methodology for conducting honest interviews – into conversational behavior. The technology serves the methodology, not the other way around. The result: findings that conventional surveys miss entirely. Struggle-attempt-failure-desire chains. The signal buried in human complexity.

Evans talks about finding the right "wrapper." We built one because nothing else worked.

We Closed the Deployment Gap Before It Had a Name

Here's where most AI projects fail. Evans now calls it the deployment gap – high awareness, low daily engagement. Consumer use stays experimental. Enterprise pilots fail at a 95% rate.

We avoided this by refusing to build a general chatbot. One specific workflow: qualitative research interviewing. A precise tool for a precise job. Not strategy – instinct. General-purpose felt like a trap.

The validation numbers tell us it worked. 97.6% participant satisfaction. A 99% recommendation rate. That's not AI enthusiasm – that's product-market fit.

Evans has been asking where the daily-use applications will emerge. We built an answer before he framed the question: in workflows where specificity replaces generality.

We Experienced Jevons Paradox Firsthand

Evans frames AI as "infinite interns" – a step-change reduction in costs that triggers Jevons Paradox. Organizations do vastly more work, not just cheaper work.

We lived this before we had the economic framework for it. Our tool transforms qualitative research from episodic project work into ongoing infrastructure. The Anthropic validation involved 1,250 qualitative interviews – a task that would traditionally consume months of team effort. Qualitative depth at quantitative scale. Continuous feedback instead of quarterly check-ins.

We didn't set out to prove Jevons Paradox. We just kept asking "what else can we do now?" and the answer kept expanding.

We Proved Probabilistic Tasks Yield to Intelligent Systems

Evans draws a line. Before, computers automated rule-based tasks. Now, generative AI automates probabilistic tasks – things historically "hard to explain to a computer but easy to explain to an intern."

Qualitative interviewing sits squarely in that category. Reading tone. Sensing deflection. Detecting implicit concerns. Judgment calls. Contextual reads. The kind of work that resisted automation until now.

We reconfigured the LLM to ask questions instead of answering them. A shift in application we discovered through experimentation, not analysis.

When the Map Matches the Territory

Platform shifts generate plenty of theory. Practitioners generate proof. Sometimes they converge.

Evans's latest analysis gave us language for what we'd already built. Every element of his framework found expression in our work: value moving up the stack, experience driving design, deployment happening through specificity, efficiency triggering expansion, probabilistic work yielding to intelligent systems.

We didn't follow a map. We made the journey. And when Evans published his current analysis, we recognized the terrain.

The model was our beginning.

The product is where we found value.

And it turns out the theory was waiting to catch up.