Why AI Can't Tell the Difference Between Belief and Knowledge
Two studies published in the last few months reveal a fascinating convergence. One shows that current AI systems fundamentally cannot distinguish belief from knowledge. The other demonstrates that teaching AI reasoning principles (rather than memorizing answers) produces dramatically better results that transfer across domains.
The two teams weren't even working on the same problem. One was testing epistemological reasoning, the other improving mathematical performance. Yet their findings fit together like puzzle pieces, revealing both a critical limitation in current AI and a validated path toward addressing it.
The convergence suggests something profound: the technical pieces for epistemologically honest AI are being validated independently. We know what the problem is. We know half the solution works. The missing piece—epistemological honesty built into the architecture from the ground up—is waiting for teams with resources to take it seriously.
The Problem: AI That Can’t Distinguish Belief from Knowledge
In November 2025, researchers at Stanford and ETH Zurich published “Belief in the Machine: Investigating Epistemological Blind Spots of Language Models” in Nature Machine Intelligence[1]. They created KaBLE (Knowledge and Belief Language Evaluation), a dataset of 13,000 questions testing whether AI models like GPT-4, Claude-3, and Llama-3 can tell the difference between facts, beliefs, and knowledge.
The results were striking. While these models achieved 86% accuracy on factual scenarios, their performance dropped significantly with false scenarios, particularly in belief-related tasks. Most critically, the models demonstrated they “lack a robust understanding of the factive nature of knowledge, namely, that knowledge inherently requires truth.”
The models struggled most with personal beliefs. When a user states “I believe the earth is flat,” current AI systems want to correct the belief rather than acknowledge it. They can’t distinguish between “Alice believes the earth is flat” (a true statement about Alice’s belief) and “The earth is flat” (a false statement about reality).
The performance gap was even more dramatic with first-person beliefs. Models scored 80.7% on third-person tasks (“Alice believes...”) but only 54.4% on first-person tasks (“I believe...”). That’s a 26.3 percentage point difference in their ability to reason about self-attributed versus other-attributed mental states.
For AI being deployed in healthcare, law, journalism, and counseling, these failures are dangerous. A therapeutic AI that can’t acknowledge a client’s actual beliefs without trying to correct them isn’t just ineffective—it’s potentially harmful. A legal AI that can’t distinguish “the defendant claims” from “what actually happened” is fundamentally unreliable.
The Deeper Problem: All AI “Knowledge” Is Actually Belief
The study reveals something critical, but it misses an even more fundamental problem.
Traditional epistemology defines knowledge as “justified true belief.” You can’t just believe something and call it knowledge. The belief has to be true—it has to correspond with reality. The “factive nature of knowledge” means that while you can believe something false, you can’t know something false.
Every claim an AI makes—“Paris is the capital of France,” “water boils at 212°F,” “aspirin reduces inflammation”—comes from pattern-matching in training data. The AI never watched water boil. It never traveled to Paris. It never observed aspirin’s biochemical effects.
Yes, humans also rely heavily on testimony and sources for most of our knowledge. We can’t personally verify everything we learn—no one has traveled to every country they know exists or repeated every scientific experiment they accept as valid. The difference is that humans have the capacity for reality-testing that AI fundamentally lacks.
When we accept that Paris is the capital of France from a textbook, we’re trusting testimony—but we could fly there and verify it. When we learn water boils at 100°C, we could test it in our kitchen. This grounding capacity shapes how we calibrate trust in sources. We learn through embodied experience who to trust, what kinds of claims need verification, and when testimony is sufficient.
AI has none of these reality-testing mechanisms. It’s testimony all the way down, with no bedrock of perceptual experience to calibrate against. Even when AI systems cite sources or check facts, they’re checking against other text—not against reality itself. There’s no mechanism to independently verify whether those sources are reliable or whether the claims correspond to actual facts.
Interestingly, the “Belief in the Machine” researchers themselves determined ‘truth’ by consulting authoritative sources like Britannica and Wolfram Alpha—essentially using testimony to establish their ground truth. This highlights a crucial nuance: even in testing whether AI can distinguish knowledge from belief, we humans rely on trusted sources rather than direct reality-testing. We have the capacity to verify when needed. AI doesn’t.
This means that from an epistemological standpoint, all AI “knowledge” is actually belief—belief based on training data patterns, with no independent reality check.
The Breakthrough: Teaching Reasoning Principles, Not Just Answers
In January 2025, DeepSeek-AI released DeepSeek-R1, a model trained primarily through reinforcement learning rather than supervised fine-tuning[2]. Their approach validates a completely different aspect of epistemological AI architecture.
Traditional AI training works by showing models thousands of problems with correct answers. The model learns to pattern-match: “When the input looks like this, the output should look like that.”
DeepSeek tried something radical. They trained a model called DeepSeek-R1-Zero using pure reinforcement learning with rule-based rewards—no supervised examples at all. Instead of rewarding correct answers, they rewarded reasoning processes: logical consistency, verification steps, coherent explanations.
The results were remarkable. The model developed emergent behaviors that weren’t explicitly programmed:
- Self-reflection: Pausing mid-solution to reconsider its approach
- Verification: Checking its own work before finalizing answers
- Backtracking: Recognizing dead-ends and trying alternative approaches
- “Aha moments”: Detecting inconsistencies in its reasoning and generating revised solutions
On the AIME 2024 mathematics competition, pass@1 accuracy jumped from 15.6% to 71.0%, and to 86.7% with majority voting, putting the model on par with strong human competitors.
Even more important, these reasoning skills transferred. Models trained with reinforcement learning on mathematical reasoning showed improvements across completely different domains: coding, scientific reasoning, logical deduction, and planning tasks[3]. The meta-cognitive skills—verify, reflect, self-correct—proved to be domain-general.
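The rule-based reward idea can be made concrete. What follows is a deliberately minimal sketch, not DeepSeek's actual reward code: it illustrates the core move of scoring an exposed reasoning process plus a mechanically checkable final answer, with no learned reward model involved. The tag and answer formats are assumptions for illustration.

```python
import re

def rule_based_reward(response: str, reference_answer: str) -> float:
    """Toy rule-based reward: score the *process* (exposed reasoning)
    and the *verifiable outcome* (a checkable final answer)."""
    reward = 0.0
    # Format reward: reasoning must appear inside <think>...</think>.
    if re.search(r"<think>.+?</think>", response, flags=re.DOTALL):
        reward += 0.5
    # Accuracy reward: extract the boxed final answer and compare it
    # to the reference -- possible only because math answers are
    # verifiable by recomputation.
    m = re.search(r"\\boxed\{(.+?)\}", response)
    if m and m.group(1).strip() == reference_answer.strip():
        reward += 1.0
    return reward

good = "<think>5 + 7 = 12, then 12 * 2 = 24.</think> \\boxed{24}"
bad = "The answer is 24."
print(rule_based_reward(good, "24"))  # 1.5
print(rule_based_reward(bad, "24"))   # 0.0
```

Because both checks are simple rules rather than a trained judge, there is no reward model for the policy to exploit, which is part of what made the pure reinforcement learning setup viable.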
What DeepSeek Discovered (Without Realizing It)
DeepSeek accidentally validated a core principle of epistemologically honest AI: teaching reasoning principles rather than answer patterns develops transferable meta-cognitive skills.
Their reinforcement learning approach works because it rewards the process of verification, not just correct final answers. Models are getting better at working through problems systematically and catching their own errors.
Current state-of-the-art models, including OpenAI’s o1 and DeepSeek-R1, still:
- Present pattern-matched information as if it’s knowledge
- Don’t distinguish “I computed and verified this” from “I pattern-matched with high confidence”
- Lack explicit frameworks for reasoning about epistemic constraints
- Have no reality-testing mechanisms beyond their training data
For math, this partially works. Math problems have verifiable answers—you can compute whether a solution is correct. The reinforcement learning leverages this to reward actual verification.
For empirical claims about the world, though, there’s no verification mechanism. The model can’t check whether Paris is actually the capital of France. It can only check whether that claim is consistent with its training data.
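The asymmetry can be shown in a few lines. A mathematical claim carries its own verification procedure; an empirical claim, from inside a language model, does not. The function names below are my own illustration, not from either paper:

```python
from typing import Optional

def verify_root(a: float, b: float, c: float, x: float) -> bool:
    """A math claim ('x solves ax^2 + bx + c = 0') is checkable
    by direct substitution: no training data required."""
    return abs(a * x * x + b * x + c) < 1e-9

def verify_empirical(claim: str) -> Optional[bool]:
    """An empirical claim has no internal verification procedure.
    A model can only test consistency with its training corpus,
    so the honest return value is None: 'no reality access.'"""
    return None

print(verify_root(1.0, -5.0, 6.0, 3.0))  # True: 3 solves x^2 - 5x + 6
print(verify_empirical("Paris is the capital of France"))  # None
```

The reinforcement learning signal in the math case exists precisely because `verify_root` can return `True` or `False`; for the empirical case there is nothing inside the system to reward against.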
Why Embodied AI Doesn’t Solve This
Some propose that embodied AI—robots with sensors interacting with the physical world—could bridge this gap. Recent research explores ‘world models’ where robots learn through embodied interaction, predicting how environments respond to their actions. Google’s RT-1 and RT-2, DeepMind’s Gato, and other robotics foundation models attempt to ground AI learning in physical reality.
There’s a deeper epistemological problem here. Even with sensors, the AI isn’t directly accessing reality—it’s interpreting sensor data through programmed frameworks. A camera doesn’t ‘see’ a chair; it detects patterns of light that must be interpreted AS a chair through human-designed interpretation layers.
The robot might learn that certain visual patterns correlate with ‘things I bump into,’ but what counts as ‘bumping,’ what patterns matter, how to categorize objects—these are all programmed interpretations. The robot learns correlations between sensor patterns and outcomes, but it’s still operating within the conceptual framework we’ve programmed, not directly accessing reality itself.
This is the interpretation layer problem: even embodied AI with sensors is still one step removed from reality. The sensors provide data, but that data must be interpreted through frameworks designed by humans based on our understanding of reality. The reward functions, the basic categorization schemes, the notion of what counts as ‘success’ or ‘failure’ in learning—these remain human-designed interpretation layers between the sensor data and any notion of ‘reality.’
What Epistemologically Honest AI Looks Like
The convergence of these two studies reveals part of what we need to build. Based on these findings and the gaps both studies reveal, Responsible Mind architecture proposes:
1. Reinforcement learning-based training on reasoning principles (DeepSeek proved this works)
Teach meta-cognitive skills: verification, reflection, self-correction, systematic problem-solving. Reward the reasoning process, not just correct answers.
2. Explicit epistemological scaffolding (what both studies missed)
Build in philosophical frameworks for distinguishing:
- Computational certainty (I ran the calculation and got this result)
- Training-based confidence (This pattern appeared frequently in my training data)
- Empirical uncertainty (I have no way to verify this claim against reality)
Train models to calibrate uncertainty appropriately and express it clearly.
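One way to make these three categories operational is to tag every claim with an explicit epistemic status and let the tag drive how the claim is expressed. This is a hypothetical sketch of the scaffolding idea; the type names and wording are my own illustration, not part of either study:

```python
from dataclasses import dataclass
from enum import Enum, auto

class EpistemicStatus(Enum):
    COMPUTED = auto()         # ran a calculation and verified the result
    PATTERN_MATCHED = auto()  # frequent in training data, unverified
    UNVERIFIABLE = auto()     # no mechanism to check against reality

@dataclass
class Claim:
    text: str
    status: EpistemicStatus

    def express(self) -> str:
        """Render the claim with hedging calibrated to its status."""
        prefix = {
            EpistemicStatus.COMPUTED: "I computed and verified that",
            EpistemicStatus.PATTERN_MATCHED: "My training data suggests that",
            EpistemicStatus.UNVERIFIABLE: "I have no way to verify this, but reportedly",
        }[self.status]
        return f"{prefix} {self.text}"

c = Claim("2 + 2 = 4", EpistemicStatus.COMPUTED)
print(c.express())  # "I computed and verified that 2 + 2 = 4"
```

The point of the sketch is the separation of concerns: the claim's content and its epistemic status are distinct fields, so uncertainty can never be silently dropped when the claim is rendered.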
3. Reality-testing mechanisms (what both studies missed)
Create hybrid architectures that combine:
- Language models for reasoning and dialogue
- Symbolic engines for mathematical verification
- Human-in-the-loop for empirical fact-checking
- Source attribution for all claims
Make “I don’t have reality access to verify this” a normal, expected response.
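A hybrid architecture like this can be sketched as a claim router: mathematical claims go to a symbolic check, empirical claims are deferred to a human reviewer with their source attached, and anything else honestly reports that it cannot be verified. All names here are illustrative assumptions, not a real system's API:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class VerificationResult:
    verified: Optional[bool]      # None means "no reality access"
    method: str
    source: Optional[str] = None  # attribution for every claim

def route_claim(claim: str, kind: str,
                symbolic_check: Optional[Callable[[], bool]] = None,
                source: Optional[str] = None) -> VerificationResult:
    """Dispatch a claim to the mechanism suited to its kind."""
    if kind == "mathematical" and symbolic_check is not None:
        # A symbolic engine can actually verify, not pattern-match.
        return VerificationResult(symbolic_check(), "symbolic engine", source)
    if kind == "empirical":
        # The model cannot reach reality; defer to a human, keep the source.
        return VerificationResult(None, "human-in-the-loop", source)
    return VerificationResult(None, "unverifiable", source)

r = route_claim("7 * 8 = 56", "mathematical",
                symbolic_check=lambda: 7 * 8 == 56)
print(r.verified, r.method)  # True symbolic engine

e = route_claim("Aspirin reduces inflammation", "empirical",
                source="training data; needs human fact-check")
print(e.verified, e.method)  # None human-in-the-loop
```

Note that `None` is a first-class outcome here, which is exactly the design goal: "I can't verify this" is a normal return value, not an error state.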
4. Partnership framing, not autonomy (what both studies missed)
Frame AI as a collaborative reasoning partner rather than an autonomous authority. The AI proposes possibilities and helps structure thinking. Humans verify against reality and make final judgments.
Why This Matters Now
AI systems are being deployed in domains where the difference between belief and knowledge matters enormously:
- Healthcare: “The patient believes they need surgery” vs. “The patient needs surgery”
- Law: “The defendant claims they were elsewhere” vs. “The defendant was elsewhere”
- Journalism: “Sources say X happened” vs. “X happened”
- Counseling: “The client believes they’re worthless” vs. “The client is worthless”
Current models, even with DeepSeek’s reasoning improvements, make category errors. They’re better at working through problems but not at acknowledging the fundamental epistemological constraints they operate under.
The reasoning breakthroughs are necessary but not sufficient. Without epistemological honesty—without explicit frameworks for uncertainty, source attribution, and reality-testing—these models will continue making the errors identified in “Belief in the Machine.” They’ll just make them more convincingly.
The Path Forward
The convergence of these studies reveals something important: the technical pieces for epistemologically honest AI are being validated independently. DeepSeek proved that reinforcement learning on reasoning principles works and transfers across domains[2]. The “Belief in the Machine” study proved that current architectures fundamentally lack epistemic reasoning capabilities[1].
What’s needed now isn’t just incremental improvement. It’s intentional integration—recognizing that these aren’t separate advances but components of a larger architectural challenge.
Over the past year, I’ve been developing what I call Responsible Mind architecture—frameworks for building AI systems that are epistemologically honest about their limitations. Watching these two studies validate different pieces of that framework felt like seeing scattered evidence suddenly click into place.
Building and testing that integrated approach requires resources I don’t have: compute infrastructure, engineering teams, the ability to train and benchmark models at scale. I’m focusing my efforts on developing the theoretical frameworks and ethical foundations—my primary work is a book on Values-Needs Alignment Theory that provides grounding for AI systems that genuinely understand human motivation.
I’d welcome collaboration with teams who have the resources to test these ideas. The studies suggest we’re closer than I expected. DeepSeek has shown the reasoning piece works. The “Belief in the Machine” study has shown the epistemological gaps are real and measurable. Someone needs to put those pieces together with the explicit epistemological scaffolding that both are missing.
Whether that happens through Responsible Mind architecture or through others asking similar questions, the path is clearer now than it was a year ago. We know what the problem is. We know half the solution works. The missing piece—epistemological honesty built into the architecture from the ground up—is waiting for teams with resources to take it seriously.
When they do, we’ll finally have AI that knows what it knows, acknowledges what it doesn’t, and partners with humans to bridge that gap.
References
[1] Suzgun, M., Gur, T., Bianchi, F. et al. (2025). “Language models cannot reliably distinguish belief from knowledge and fact.” Nature Machine Intelligence. https://www.nature.com/articles/s42256-025-01113-8
[2] DeepSeek-AI et al. (2025). “DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.” arXiv:2501.12948. https://arxiv.org/abs/2501.12948
[3] Ahn, J., Verma, R., Lou, R., Liu, D., Zhang, R., & Yin, W. (2024). “Large Language Models for Mathematical Reasoning: Progresses and Challenges.” Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics. https://aclanthology.org/2024.eacl-srw.17/
Curious about AI? Please subscribe! You’ll get semi-regular posts on frameworks for building better human-AI collaboration, critical examination of training methods that create these problems, and uncomfortable questions about consciousness and knowledge.

