On paper, 70% accuracy sounds… fine, I guess. Better than a coin toss. Good enough to “get started,” right?
But in UX, it’s not just inadequate—it’s actively dangerous. And I think a lot of teams are realizing this the hard way.
This uncomfortable truth came up in a recent Nielsen Norman Group UX Podcast conversation featuring Christian Holst and Jamie Appleseed from Baymard Institute, alongside Kate Moran from Nielsen Norman Group.
What they unpacked is something many teams are quietly struggling with right now: AI tools are getting very good at sounding right—without being reliably right.
And in UX, that gap matters more than most people realize. Maybe more than in any other field, actually.
The 10-Suggestion Trap
Imagine this scenario. You run your interface through an AI UX audit tool. It gives you 10 recommendations. Seven are genuinely good. Three are subtly but seriously wrong.
Here’s the real problem: you can’t tell which is which.
And if you could reliably tell them apart… you wouldn’t need the AI tool in the first place. That’s the circular logic trap we’re in.
That’s not a hypothetical risk, by the way. That’s the core issue with AI-driven UX analysis today.
UX is full of small decisions with outsized impact. Thumbnails vs. dots under a product image. Button placement on a checkout page. Error-handling copy that appears once every 200 sessions. One “minor” UI detail can shift conversion by millions of dollars at scale.
So when an AI tool is wrong 30% of the time—even politely wrong—it can quietly cancel out the gains from everything it got right. You move fast. You ship confidently. And you end up exactly where you started, or worse.
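Here’s a rough back-of-the-envelope version of that math. Every number below is invented for illustration (the podcast didn’t give these figures), but the shape of the problem holds: modest wins from the correct suggestions, plus losses from the confidently wrong ones, can net out to almost nothing.

```python
# Hypothetical expected-value sketch: 10 AI suggestions, 70% accurate.
# All lift/drag numbers are illustrative assumptions, not measured data.

GOOD_LIFT = 0.010   # assume each correct suggestion adds ~1.0% conversion
BAD_DRAG = -0.022   # assume each wrong one quietly costs ~2.2%

suggestions = 10
accuracy = 0.70

good = round(suggestions * accuracy)   # 7 genuinely helpful changes
bad = suggestions - good               # 3 subtly harmful ones

net = good * GOOD_LIFT + bad * BAD_DRAG
print(f"{good} good, {bad} bad -> net conversion change: {net:+.1%}")
# 7 good, 3 bad -> net conversion change: +0.4%
```

And notice the worst part: that +0.4% tells you nothing about which three changes to roll back.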
Why This Is a UX-Specific Problem
In many fields, a 70% success rate might be acceptable. Perhaps even good. UX is different, though.
Because UX decisions are interconnected, context-dependent, and often irreversible once scaled, a bad recommendation doesn’t just fail. It can actively degrade performance in ways that ripple outward.
Baymard shared real examples during the podcast. Replacing tiny dot indicators with image thumbnails on a product page increased conversion by 1% for a Fortune 500 retailer.
Duplicating the “Place Order” button at both the top and the bottom of checkout? That generated a $10M annual revenue lift.
Now imagine an AI confidently suggesting the opposite—because “minimalism” sounds cleaner. It sounds smart. It sounds reasonable. It’s catastrophically wrong.
The Accuracy Question Nobody Is Asking
One of the most important points from the discussion, and honestly something I hadn’t thought about enough: most AI UX tools don’t publish their accuracy rates at all.
And when accuracy is measured, recent independent studies show many tools land between 50% and 70% accuracy.
And there’s a catch: pushing accuracy higher means narrowing what the tool can safely evaluate, which creates an uncomfortable tension between comprehensiveness and reliability.
Baymard made a deliberate decision with their tool, UX-Ray: to cover fewer heuristics, but deliver around 95% accuracy.
That choice wasn’t easy—or cheap. Documenting accuracy alone costs six figures, which is wild when you think about it.
But it protects something far more valuable than features or hype. Trust.
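If you want to see why that trade-off bites, here’s a toy sketch of the idea. This is my own illustration, not how UX-Ray actually works; every heuristic name and accuracy figure below is made up.

```python
# Illustrative sketch of the accuracy-vs-coverage trade-off.
# Heuristic names and accuracy figures are invented for the example.

validated_accuracy = {
    "thumbnail_gallery_indicators": 0.96,
    "duplicate_order_button": 0.95,
    "error_message_specificity": 0.81,
    "form_field_labeling": 0.74,
    "visual_hierarchy_judgment": 0.58,
}

THRESHOLD = 0.95  # only report findings you can stand behind

shippable = {h: a for h, a in validated_accuracy.items() if a >= THRESHOLD}

coverage = len(shippable) / len(validated_accuracy)
print(f"Heuristics the tool may evaluate: {sorted(shippable)}")
print(f"Coverage: {coverage:.0%} of the catalog, at >=95% accuracy")
# Coverage: 40% of the catalog, at >=95% accuracy
```

Raise the bar and the catalog shrinks. That’s the choice: say less, but mean it.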
Why “Looks Right” Is the Most Dangerous Phrase in UX
Generative AI excels at producing outputs that are well-phrased, confident, and structurally convincing. They’re shaped like good insights.
But UX isn’t about sounding correct. It’s about being accurate in context, and that’s a much harder bar to clear.
When teams implement AI-generated UX suggestions without deep validation, they often say: “It looked right, so we shipped it.”
That’s how organizations end up iterating endlessly, chasing trends, burning credibility with leadership, and eventually concluding that “UX doesn’t really work.” Not because UX failed—but because bad tools were trusted too early. I’ve seen this happen more than once, actually.

Acting Like a Professional in an AI World
One of the strongest ideas from the episode was simple, but it stuck with me: a professional owns outcomes—not tools.
Using AI isn’t unprofessional. Blindly trusting it is.
Professionals reduce uncertainty where it matters. They understand acceptable risk.
They know when speed is worth it—and when it isn’t. That’s the judgment call that separates real expertise from just following instructions.
That’s why Baymard intentionally limits AI’s role in UX-Ray. AI handles classification (what pattern is present); humans define judgment (is that pattern good, harmful, or risky). Probabilistic systems are used where they’re strong, and deterministic logic is used where correctness matters.
This also makes mistakes visible—so humans can catch them before damage is done, which seems obvious but is surprisingly rare in how most tools work.
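In code terms, the split looks something like this. To be clear, this is my sketch of the general pattern, not Baymard’s implementation; every name, pattern, and threshold in it is a placeholder.

```python
# Sketch: probabilistic classification, deterministic judgment.
# The classifier output and rulebook here are placeholders for illustration.

from dataclasses import dataclass

@dataclass
class Classification:
    pattern: str       # e.g. "dot_indicators_on_product_gallery"
    confidence: float  # the model's own estimate, 0..1

# Human-authored, deterministic rulebook: pattern -> verdict.
# The model never decides what is good or harmful; people do.
RULEBOOK = {
    "dot_indicators_on_product_gallery": "harmful: replace with thumbnails",
    "duplicate_place_order_button": "good: keep at top and bottom",
}

CONFIDENCE_FLOOR = 0.9

def review(c: Classification) -> str:
    # Low-confidence classifications are surfaced, not silently acted on,
    # so a human can catch mistakes before damage is done.
    if c.confidence < CONFIDENCE_FLOOR:
        return f"needs human review: unsure this is '{c.pattern}'"
    # Judgment comes from the deterministic rulebook, not the model.
    return RULEBOOK.get(c.pattern, "unknown pattern: escalate to a human")

print(review(Classification("dot_indicators_on_product_gallery", 0.97)))
print(review(Classification("duplicate_place_order_button", 0.62)))
```

The design choice worth copying isn’t the rulebook itself; it’s that uncertainty has an explicit exit ramp to a human instead of a confident guess.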
The Quiet Risk for Junior Designers
There’s a deeper concern here, and perhaps it’s the one that worries me most.
AI UX tools are most often used by junior designers, non-UX specialists, and teams without research support: the very people who are least equipped to spot bad advice.
Heuristic evaluations, guideline application, and expert reviews are how UX instincts are built.
Outsourcing that thinking too early risks creating a generation that executes without understanding, ships without confidence, and can’t explain why a decision was made.
That’s not a tooling problem. That’s a professional development problem. And it might take years to show up as a real issue.
So… Can AI Replace UX Research?
Not yet. Not even close, honestly.
AI can accelerate observation, support pattern detection, and reduce grunt work. But judgment, context, and accountability are still human responsibilities.
And until AI tools can clearly say, “Here’s what I know, here’s what I don’t, and here’s how often I’m wrong,” they should be treated as assistants, not authorities.
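Concretely, an “honest” finding might look something like this. This is a hypothetical output format I made up to illustrate the standard; it isn’t any real tool’s API, and every field and number is invented.

```python
# Hypothetical shape of an "honest" AI UX finding. Every field and
# value here is invented to illustrate the reporting standard.

finding = {
    "pattern": "dot indicators instead of thumbnails on product gallery",
    "recommendation": "replace the dots with tappable image thumbnails",
    "validated_accuracy": 0.95,  # "here's how often I'm wrong": ~1 in 20
    "validation_method": "benchmarked against human expert reviews",
    "out_of_scope": [            # "here's what I don't know"
        "your brand's visual identity constraints",
        "whether your audience matches the benchmark population",
    ],
}

for field, value in finding.items():
    print(f"{field}: {value}")
```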
The One Question Every Team Should Ask
Before adopting any AI UX tool, ask this: “What is your documented accuracy rate—and how was it measured?”
If there’s no answer? That’s your answer.
Final Thought
AI isn’t killing UX. Bad UX decisions are.
Use AI like a professional. Demand evidence. Own the outcome.
That’s how UX survives—and improves—in the age of automation. Or at least, that’s what I’m hoping for.