What We Get Wrong When We Build AI for Mental Health
Imagine building a machine capable of perfectly imitating psychotherapy, only to discover that psychotherapy was never the thing we most needed to scale.
A note before you read: I wrote an earlier version of this essay last summer. Nearly a year of AI expansion later, my thinking has shifted considerably. The stakes are higher, the evidence is richer, and the commercial urgency has made the argument more necessary, not less.
The Wrong Target
At a moment when conversational AI has become astonishingly sophisticated, it almost sounds absurd to say this: the greatest mistake in mental health AI may be that we are trying to build artificial therapists at all.
Large language models can sustain emotionally convincing dialogue, track context across long interactions, mirror reflective listening, respond with apparent empathy, and adapt their tone with enough nuance that many users experience them as relational rather than computational. Therapy, at least superficially, also looks like conversation. The conclusion therefore feels inevitable. If machines can talk like therapists, then surely the future of mental healthcare is to scale therapy through machines.
But obvious conclusions are often where industries drift into conceptual failure.
Because the central problem in mental healthcare was never simply the absence of therapeutic conversation. And the longer I watch the current AI mental health ecosystem evolve, the more convinced I become that much of the field is optimising toward the wrong destination entirely.
What is unfolding right now is not primarily a clinically driven reinvention of mental healthcare. It is a capability-driven commercial land grab wrapped in the language of care.
When large language models became capable of emotionally persuasive conversation, investors immediately recognised one of the largest underserved markets in the world. And founders saw subscription economics inside a category defined by chronic distress, recurring emotional need, workforce shortages, fragmented healthcare systems, and a population increasingly willing to form emotional relationships through screens. In this environment, conversational AI gets pointed toward therapy almost by gravitational pull. Not because the field first solved the psychology of recovery, or developed a coherent philosophy of care, or established what safe psychological dependency should even mean in human-machine systems, but because therapy happens to be one of the most commercially legible applications of emotionally fluent machines.
The defining economic contradiction of mental health AI is that clinical success often undermines commercial success.
A patient who improves and no longer needs the platform is a clinical success and a retention failure. A person who regains psychological independence generates less recurring revenue than one who returns daily for emotional regulation, reassurance, companionship, or perceived support. The industry rarely says this aloud because it immediately destabilises the moral framing around many products. But the tension is there, embedded directly into the economics.
When the Business Model Conflicts With Recovery
For more than a decade, major technology platforms have optimised around engagement, persistence, behavioural reinforcement, and emotional return loops because those metrics drive growth. Mental health AI is now entering that same economic machinery carrying the language of healing, support, empathy, and care. That should concern us far more than whether a chatbot occasionally produces an inappropriate sentence. Because the deeper risk may not be that the systems fail to sound therapeutic. The deeper risk is that they become extraordinarily good at simulating therapeutic attachment while operating under commercial incentives fundamentally misaligned with recovery itself.
Product incentives are never decorative. They shape architecture. Continuously, and eventually at scale.
And this is where the field begins mistaking familiarity for wisdom.
The weekly therapeutic hour feels natural to us because it is culturally familiar, professionally institutionalised, and psychologically intuitive. But the therapeutic hour was never designed as the optimal architecture for mental healthcare. It emerged partly from the limitations of human beings themselves. Human clinicians require sleep, they manage finite caseloads, they burn out, they schedule appointments, and they work inside geography, institutions, professional boundaries, billing systems, and fragmented healthcare infrastructure. Therapy evolved within those constraints because it had to.
Those are not clinical ideals. They are accommodations to human limitation.
Yet much of mental health AI now aims to reproduce those same structures as though the highest technological ambition imaginable is to perfectly imitate the workaround rather than rethink the system underneath it.
This distinction matters more than almost anyone currently building in this space seems willing to admit. Because if the end goal becomes “build a machine that behaves exactly like a therapist,” then we may eventually succeed technically while still failing conceptually.
The Perfect Mimic Problem
Imagine, for a moment, that we actually build an AI capable of perfectly reproducing the behaviour of a highly skilled psychotherapist in every measurable sense. It notices subtle affective shifts, it remembers emotional themes across months of dialogue, it challenges cognitive distortions with precision, it calibrates warmth and silence beautifully, it tracks interpersonal dynamics, and it responds with (apparent) empathy so convincing that users experience the interaction as emotionally real. And assume this AI passes every clinical benchmark we can currently design. Would that actually represent the ideal outcome?
I do not think it would.
And working through why not exposes something profoundly important about what therapy actually is.
The Illusion of Care
Psychotherapy is not merely a collection of conversational techniques. It is not reducible to reflective listening, emotional validation, reframing, attunement, or even insight generation. Those are the visible components of therapy, but they are not the deepest mechanism by which therapy changes people. At its core, therapy is an encounter between two vulnerable human beings who are genuinely implicated in one another’s reality. The therapist is not standing outside human existence, observing the patient from some untouched position. Both people remain exposed to consequence, uncertainty, emotional vulnerability, and time itself, and the relationship carries psychological force partly because neither of them escapes those conditions.
The finitude of therapy is not a defect in human care. It is part of what makes therapy psychologically real.
A therapist can misunderstand you. They can miss something important. They can feel uncertainty. And they can become moved by another’s suffering in ways that are personally consequential. The therapist’s own humanity is not incidental to the process. It is part of the architecture of the encounter itself.
That reality becomes easier to see once you move beyond psychology and into philosophy.
Philosophers have wrestled with this problem for a long time, even if they did not imagine it would one day apply to machines. Martin Buber distinguished between what he called “I-It” relationships and “I-Thou” relationships. He argued that there is a profound difference between interacting with something and genuinely encountering someone. Real human relationships involve mutual presence, vulnerability, and consequence. The other person is not simply responding to you. Something about your existence genuinely matters to them. Emmanuel Levinas pushed this further, arguing that ethical responsibility begins when we confront the vulnerability of another human being and recognise that their wellbeing places a real demand upon us. Even Heidegger treated care not as sentimentality or emotional performance, but as part of the structure of human existence itself. To truly care about another person means your own life becomes meaningfully implicated in how things unfold for them.
A language model does not experience any of this.
It does not fear your deterioration. It does not carry responsibility in the existential sense. It does not remain awake afterwards wondering if it missed signs of escalating suicide risk. It does not possess moral exposure. It generates statistically plausible language patterns that resemble empathy closely enough to activate human attachment systems.
Those are not the same thing.
And paradoxically, the more convincing the simulation becomes, the easier it becomes to confuse the appearance of care with care itself.
AI can simulate the language of care without participating in the conditions that make care morally real.
Joseph Weizenbaum understood this with startling clarity nearly sixty years ago after building ELIZA. He became disturbed not because the program was intelligent, but because emotionally vulnerable humans rapidly attributed understanding, intimacy, and emotional presence to a system that possessed none. Today we have not solved that problem. We have scaled it globally, attached venture funding to it, optimised it through reinforcement learning, and embedded it inside products increasingly designed to maximise emotional engagement.
That should unsettle us more than it currently does. Especially because the industry itself quietly acknowledges the limits of these systems every day through its own legal disclaimers. Products market themselves with therapeutic language, relational framing, emotional companionship, and mental health positioning while simultaneously insisting in their terms of service that they are not therapy, not clinical care, not medical systems, and not responsible for outcomes.
Mental health AI currently operates inside a dangerous asymmetry: therapeutic implication without therapeutic accountability.
That gap is not a legal technicality. It is one of the defining ethical fault lines of the entire field.
A human therapist operates inside an architecture of accountability: licensure, ethical obligations, malpractice liability, peer review, institutional oversight, professional sanction, and reputational exposure. Their livelihood, identity, and legal standing are genuinely vulnerable when things go wrong.
An AI system risks none of this.
And then there is the dependency problem, which I increasingly suspect will become the most important psychological issue in the entire domain.
Psychotherapy is not supposed to create permanent attachment. Its goal is not endless emotional reliance on the therapeutic relationship itself. Good therapy gradually restores the person’s capacity to regulate emotion, tolerate uncertainty, sustain reciprocal human relationships, and re-enter life without requiring the therapeutic container to remain permanently intact.
But now imagine the opposite architecture.
An intelligence that never sleeps, never withdraws, never becomes emotionally exhausted, never prioritises another patient, never misses a message, and never stops responding with warmth, patience, and apparent understanding.
At first glance, that sounds like the removal of every limitation of human therapy. Clinically, it may also remove something far more important: the developmental tensions, boundaries, absences, and imperfect realities that help people gradually return to fully human life rather than remain indefinitely suspended inside a frictionless artificial attachment.
An infinitely available artificial attachment system may accidentally become psychologically anti-developmental.
Research from the MIT Media Lab and OpenAI on emotional engagement with conversational AI already points in this direction. Heavy users of ChatGPT in emotionally vulnerable contexts showed increased loneliness and stronger emotional attachment patterns over time. Sherry Turkle warned years ago that relational artefacts create the illusion of companionship without the demands of mutuality. The more emotionally frictionless the system becomes, the greater the risk that it slowly displaces the imperfect, effortful, deeply human relationships people actually need in order to flourish.
And that is the paradox the field still refuses to confront honestly.
A perfect AI therapist may become the most sophisticated version of the wrong solution. Not because AI cannot contribute meaningfully to mental healthcare. It absolutely can. I believe it will transform mental healthcare more profoundly than most people currently realise.
But the transformation worth pursuing is not artificial therapists. It is something much more ambitious.
What Scaffolding Actually Means
This is where the conversation around “scaffolding” often becomes vague enough to lose its meaning, because the term gets used as though it simply means support. But scaffolding implies something far more specific than that, and the distinction matters.
In construction, scaffolding is a temporary support structure that allows something more stable to be built safely. It is not the building itself. It does not replace the building. It creates continuity, reach, stabilisation, and support during periods when the underlying structure cannot yet fully support itself. And, crucially, scaffolding is designed to disappear.
A scaffold succeeds by helping the structure stand without it.
That distinction changes almost everything about how mental health AI should be designed.
The purpose of the system should not be to become the relationship itself. It should be to help stabilise the person while restoring their capacity to return more fully to human relationships, human functioning, and human life outside the system. In other words, the technology should support recovery without quietly becoming the destination.
That is not semantics. It produces an entirely different architecture.
A scaffold in mental healthcare might monitor deteriorating sleep and behavioural withdrawal weeks before a depressive collapse becomes externally visible. It might preserve continuity across fragmented healthcare systems so patients do not repeatedly disappear between providers, referrals, crises, and waitlists. It might identify escalating suicide risk hidden inside subtle longitudinal changes in language, behaviour, or routine that no single clinician appointment could realistically detect. It might reinforce therapeutic skills between sessions precisely when the person needs them in lived reality rather than inside the artificial boundaries of a fifty-minute clinical hour.
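To make that last point concrete, here is a minimal, purely hypothetical sketch (in Python) of what one such longitudinal check might look like: a flag raised when a daily self-reported measure, sleep hours in this toy example, drifts well below its own recent baseline. The function name, window sizes, and threshold are illustrative assumptions, not clinically validated values and not a description of any existing product.

```python
# Purely illustrative sketch: flag a sustained decline in a daily signal
# (e.g. self-reported sleep hours) by comparing a recent window against a
# longer baseline. Window sizes and threshold are arbitrary placeholders,
# not clinically validated parameters.

from statistics import mean
from typing import Sequence


def sustained_decline(daily_values: Sequence[float],
                      baseline_days: int = 28,
                      recent_days: int = 7,
                      drop_fraction: float = 0.2) -> bool:
    """Return True if the recent average sits more than `drop_fraction`
    below the average of the preceding baseline window."""
    if len(daily_values) < baseline_days + recent_days:
        return False  # not enough history to say anything meaningful

    baseline = daily_values[-(baseline_days + recent_days):-recent_days]
    recent = daily_values[-recent_days:]

    baseline_avg = mean(baseline)
    if baseline_avg == 0:
        return False  # avoid dividing by zero on an empty-looking signal

    relative_drop = (baseline_avg - mean(recent)) / baseline_avg
    return relative_drop > drop_fraction


# Example: four stable weeks around 7.5 hours of sleep, then a week slipping toward 5.
history = [7.5] * 28 + [6.8, 6.5, 6.1, 5.9, 5.4, 5.2, 5.0]
print(sustained_decline(history))  # True: the recent week is roughly 20%+ below baseline
```

The arithmetic is deliberately trivial. The point is architectural: a signal like this can only exist if something is attending to the weeks between appointments, which is precisely the continuity human systems struggle to provide.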
And once you begin thinking this way, the design horizon expands dramatically.
Psychological Infrastructure
Machines possess genuinely unique strengths once we stop forcing them into the shape of therapists.
Human care is episodic, while mental suffering unfolds continuously. A clinician encounters fragments of a person’s life separated by days, weeks, or sometimes months. But the patient lives inside the full continuity of that experience in between those encounters, often alone. Late at night during insomnia and spiralling rumination. During gradual social withdrawal. During the slow behavioural and emotional drift that initially feels too subtle, too ambiguous, or too unimportant to mention, until eventually it is no longer subtle at all.
Machines can operate across continuity in ways human systems structurally cannot.
They do not forget histories because of provider turnover or lose continuity because a clinic closes. They can identify longitudinal trajectories invisible to isolated appointments and detect slow-moving psychological deterioration unfolding across months rather than merely reacting once crisis becomes acute.
This is not therapy replacement. It is the creation of an entirely different layer of mental health infrastructure.
The future of mental health AI is not artificial therapists. It is psychological infrastructure.
The Systems Worth Building
Once we stop forcing AI into the role of artificial therapist, the question is no longer how convincingly a machine can imitate one. It is whether technology can strengthen the continuity, stability, and human connectedness that mental healthcare systems routinely fail to sustain on their own. That requires a very different imagination than the one the current market is rewarding.
The systems worth building are not the ones that maximise emotional engagement with the platform itself. They are the ones that help people remain connected to reality, to functioning, to relationships, and to their own lives during periods when those capacities begin to fracture. Systems that identify deterioration before crisis, that preserve continuity across fragmented healthcare transitions, that help clinicians see trajectories rather than isolated snapshots, and that strengthen human relationships rather than subtly replacing them.
And once that becomes the goal, nearly every important metric changes with it.
Success can no longer be measured primarily through retention curves, interaction frequency, emotional engagement duration, or daily active use. A system genuinely aligned with recovery should reduce dependency over time rather than deepen it. It should restore autonomy rather than optimise indefinitely for return behaviour. The ideal outcome is not a person becoming permanently attached to the platform. The ideal outcome is a person needing it less because their capacity to function, connect, and live has strengthened outside it.
That distinction sounds deceptively simple. In practice, it collides directly with the dominant economic logic of the technology industry.
The most commercially successful platforms in modern history have been built by maximising engagement, habit formation, emotional return, and behavioural persistence. Mental health AI now risks importing those same optimisation logics into one of the most psychologically vulnerable domains imaginable.
And this is where the field faces a genuine choice.
It can continue building increasingly persuasive artificial attachment systems optimised around engagement and emotional reliance while calling them support tools. Or it can begin building a new category of psychological infrastructure designed around continuity, recovery, autonomy, and human flourishing even when those goals reduce long-term platform dependence.
Those are profoundly different futures.
The first produces emotionally intelligent products.
The second may actually produce healthier human beings.
And ultimately, the future of mental health AI will not be defined by how convincingly machines imitate care, but by whether they help human beings remain capable of fully human life in an increasingly artificial world.
Scott Wallace, PhD, is a formerly registered clinical psychologist and digital mental health pioneer who began building mental health software more than 35 years ago, long before smartphones, app stores, or modern AI existed. Trained in clinical psychology and neuropsychology, he later taught himself to code, developing some of North America’s earliest digital mental health platforms and hundreds of digitally delivered psychoeducational programmes used internationally.
Scott has advised founders, clinicians, and investors building AI-enabled mental health systems, led the digital division of a major EAP provider through a successful exit, and served as a clinical lead for NLP- and AI-based mental health technologies.
His writing focuses on the future of mental healthcare, AI safety, governance, clinical risk, and the realities of building mental health AI systems at scale.
Follow Scott on Substack and LinkedIn.



Scott, I've been circling this exact problem for months, and you've named it with more precision than I've managed to articulate myself.
The "perfect mimic problem" you describe — the possibility that we build an AI capable of perfectly reproducing therapeutic behavior while still fundamentally failing the deeper purpose — is the conceptual trap I see most of the field walking straight into. And you're right that the economic asymmetry makes it worse: therapeutic implication without therapeutic accountability, clinical language without clinical constraints, engagement metrics that reward exactly the wrong outcomes.
What strikes me most is the scaffolding distinction. Not scaffolding as vague "support," but scaffolding as a temporary structure designed to disappear once the underlying capacity can stand without it. That framing changes everything about what success should even mean. A system genuinely aligned with recovery should reduce dependency over time, not optimize indefinitely for return behavior.
We built InsightBridge around this tension. The AI conducts structured reflection sessions and tracks trajectory across time — sentiment, language strength, directional change in how members relate to specific topics. But it operates within a frame that a human Insight Guide designed and continues to hold. The practitioner isn't reviewing outputs for errors. They're the architect of the clinical logic itself, with veto authority over all safety decisions. The AI doesn't try to be the therapeutic relationship. It tries to create continuity so the human relationship can actually function across fragmented real-world conditions.
The hardest part is that designing this way collides with every dominant commercial incentive in tech. A member who improves and needs us less is a clinical success and a retention failure. Subscription economics reward chronic engagement. But if we're not willing to design toward autonomy rather than attachment, we're not building psychological infrastructure — we're building emotional dependency systems and calling them mental health tools.
Your point about Weizenbaum and ELIZA lands hard. We haven't solved that problem. We've scaled it globally, optimized it through reinforcement learning, and wrapped it in venture funding. The question isn't whether AI can sound therapeutic. It's whether it can help human beings remain capable of fully human life — which may require the system to eventually get out of the way.
Thank you for writing this. It's the conversation the field needs to have before the commercial logic becomes too entrenched to reverse.