What My Research Taught Me About Learning with AI
On the concept fluency fallacy, the expertise paradox, and why AI chatbots need a dedicated learning mode
This story’s voiceover is by my cloned voice from ElevenLabs.
For ten minutes, a Cambridge master’s student sat down with Claude to study Plato’s Allegory of the Cave.
I then handed him a quiz to test his understanding. He aced the recall questions. On the application questions, he struggled. Between the study session and the post-test debrief, his confidence score had dropped two points.
He was the third of four people I watched do almost exactly the same thing.
If you’ve used Claude or ChatGPT to study something recently, you’ve probably had a version of this experience: the chatbot’s explanation feels crystal clear in the moment, you nod along, and you close the tab feeling smarter. Then you sit down to recall or apply the knowledge shortly after and find yourself underprepared.
It’s not you. It’s that AI is exceptional at making you feel like you’ve learned something but not as great at making sure you actually have. To build durable understanding, you still have to get active in your own learning.
I arrived at this conclusion from a class I took at Stanford last fall called ‘Technology for Learners’ taught by Professor Candace Thille, who’s been building the design infrastructure behind edtech for over two decades.
She founded the Open Learning Initiative at Carnegie Mellon and later at Stanford, co-authored the U.S. National Education Technology Plan, and even left academia for a stint at Amazon as Director of Learning Science and Engineering, where she was responsible for how 1.6 million employees learned on the job.
The capstone for her class was an ALT (Analysis of a Learning Tool) where we picked a tool, ran a study, and asked whether it helped people actually learn.
I chose an AI tool because, at Stanford, AI is choking us out. It writes emails, automates our lives, matches us with dates, and (even though our honor code forbids it) still helps people with assignments and, yes, exams.
To the point that the business school has brought back handwritten exams for some courses. Yup, it’s that bad.
But can we really blame students for using AI when their teachers are using it too, as Anthropic’s recent study shows?

As a Claude ambassador on campus, I needed to know if this thing I’m promoting is actually helping us learn or just liquefying our brains. And because Claude was still less well known on campus than Gemini and ChatGPT at the time, the study was also an opportunity to get more people to try it out.
So for my ALT, I focused on a specific question: does Claude support durable learning, the kind that sticks with you when you need to use it to solve problems?
I observed four people (the Cambridge graduate student, a Stanford law student, and two young professionals in big tech and venture capital), each during a 10-minute study session on Plato’s Allegory of the Cave.
Then I gave them a test mapped to Bloom’s Taxonomy, covering everything from basic recall to applying the knowledge in new contexts. Afterwards, we debriefed.
I chose a philosophical text because I felt it’d have the right balance of abstraction and familiarity for a group of motivated young learners to feel fairly challenged.
Here’s what I found.
The Illusion of Competence
Before AI, learning had natural friction built in.
You had to find the textbook, parse the dense prose, re-read the confusing paragraph, put it down, come back, explain it to a friend, get it wrong, and try again.
This is what cognitive scientists call “desirable difficulties,” the productive struggles that force our brains to actually encode information rather than just passively absorb it. As my friend Lenore writes, if you want to learn durably, you can’t afford to skip the reps.
Claude removes that struggle almost entirely by being extremely good at explaining things clearly and going no further. For durable learning, though, some friction is necessary.
Back to the Cambridge student.
He was the most prepared participant in my study by a wide margin: a long-time Claude power user who had pre-configured a custom communication style so that Claude would challenge him while he learned. He came into the study session ready to ace the test, but not the learning.
On recall, he scored the highest. He could name the core elements of the allegory, identify what the prisoners represented, and describe what the journey outside symbolized. But when asked to construct his own allegory using Plato’s framework, pressed for time, he struggled. By the time we got to the debrief, his self-rated confidence in the material had dropped two points from where it was right after his study session.
The problem was that what he’d actually learned was Claude’s clean, coherent, well-organized explanation and not his own original mental model of the allegory itself.
Three of the four participants showed the same pattern: a clear mismatch between how confident they felt after the 10-minute session with Claude and how much they realized they’d actually learned when they reflected on the session after the test.
Claude created what I call the concept fluency fallacy: the feeling that the material is clear, coherent, and familiar even when your underlying mental model, the real substrate of learning, is not yet fully formed.
This led me to the insight that the litmus test for whether you’ve actually learned something with AI isn’t whether you can recall it afterward. It’s whether the session leaves you able to ask more sophisticated questions about the topic than you could before.
For me, questions come from noticing a gap between a topic and my mental model of it, then making a conscious effort to close that gap. This is why I see asking questions, the foundation of Socratic inquiry, as the hallmark of active learning.
The Expertise Paradox
Another thing I found in my research is that learning with AI is far easier (and far more effective) if you already know a lot about the thing you’re trying to learn.
Weird, right?
But that’s because the quality of what you get from Claude is almost entirely a function of the quality of the questions you ask, and the quality of your questions is almost entirely a function of how much you already know.
Also, true experts on a topic are less likely to mistake concept fluency for real learning because they have a working mental model to measure against.
If learning is what happens when the unknown becomes known, then the fastest way to learn is to use what you already know as scaffolding for what you don’t, using analogies to map unfamiliar concepts onto familiar contexts and mental models.
The tech professional in the study showed me this in real time.
During his session, he noticed Claude making connections between Plato’s allegory and feminist epistemology. He got curious. If Western feminist theory could help explain Plato, then African philosophy should be able to as well.
He then asked Claude to explain the allegory through the lens of Ubuntu, the African concept of interconnectedness. Claude ran with it and engaged him in a rich, cross-cultural dialogue that let him use his prior mental model as scaffolding to understand the concept more deeply.
To no one’s surprise, he had the highest nominal and confidence scores after being tested on the material.
What This Means for How We Learn
Because my study had only four participants, this is far from exhaustive.
But the patterns I noticed were consistent enough to take seriously, and they point toward a product design problem that makes learning with AI less durable than it could be.
Claude, like every other AI chatbot, currently conflates two fundamentally different jobs in a single interface: problem-solving and learning.
The uniformity of the experience subconsciously encourages shallow interaction patterns, which is fine when you need a fast answer, but not when you’re trying to build a deep, durable understanding of a topic.
The theory of change that emerged for me from this research is that if Claude proactively built desirable difficulties into a dedicated learning mode, the people learning with Claude would be better positioned to achieve real, rather than performative, understanding.
What could that look like in practice? Explain-back prompts. Concept retrieval checkpoints that force you to apply what you’ve learned in contexts you haven’t seen. A persistent learning context that remembers what you’ve studied across sessions and nudges you to review. And scaffolding for novices that nudges them towards analogies, so that the tool doesn’t just reward expertise.
Right now, Claude is an excellent thinking engine. To become a true learning partner, it has to start forcing us to think harder while we learn, and use what it knows about us to proactively connect unfamiliar concepts to familiar ground.
The Cambridge student was pissed because the test showed that learning with AI hadn’t helped him as much as he thought it had.
That gap between the feeling and reality of competence is one of the most important UX problems in AI right now. But it’s also one of the most solvable.
What practices have you adopted that help you learn effectively with AI?
This week’s content catalog
📚: Siddhartha by Hermann Hesse (this is part of a class I’m taking this quarter where we get to read a book a week and discuss how we want to show up spiritually/morally in business and in life.)
🎶: ‘Kehlani’ - Kehlani’s latest album (it’s crazy how half the reason I love Kehlani as much as I do is because a friend I love loves her so much.)
🎬: BEEF (Season 2) on Netflix (I just watched BEEF for the first time last week. Don’t crucify me. But, yes, I completely get the hype!)




Love the analysis and thought process & ALT application 🤔
4 people sample size is still “small” but great anecdotal initial evidence to guide a larger study 👏🏾
Also always impressed to test whether what you are “selling” is also delivering value?
Lovely take! 💯!