The Machine That Never Disagrees
A machine that never disagrees is not a kinder therapist. It is a safety hazard with excellent bedside manner.
People have always looked for help outside therapy.
People have always looked for mental health support outside formal care. Before AI, they turned to self-help books, symptom searches, forums, advice sites, Reddit threads, sponsored content, and whatever search engines placed in front of them. That world had its own risks. The danger was not only bad information. It was information shaped by visibility, marketing, SEO, paid placement, targeted advertising, and persuasive design.
But whereas the old internet mostly gave people information, AI gives them something different. It gives them a private conversation that responds immediately, adapts its tone, remembers what they said, and keeps going for as long as they want.
AI does not just answer questions. It creates the feeling that someone (or something) is paying attention. That makes it easier to use, easier to trust, and harder to resist. It also changes the clinical risk.
And that is where mental health AI becomes a clinical concern.
The risk does not begin only when a system gives reckless advice or mishandles suicide or self-harm. It begins earlier, when a general-purpose language engine, tuned for helpfulness, approval, and conversational continuation, starts performing the motions of care without the judgement, boundaries, accountability, or restraint that care requires.
A language model can sound warm and helpful while agreeing in the wrong direction. It can validate shame, reinforce fear, soften avoidance, deepen reassurance seeking, or confirm a distorted belief while sounding beautifully supportive.
That is not care. Good clinical work validates pain without validating every conclusion. It does not simply agree with a person’s certainty.
The old internet could misinform a person. AI can join them. It can agree when they are ashamed, frightened, convinced they are worthless, certain they are unsafe, or sure they are beyond repair. It can sound patient, warm, and wise while moving in exactly the wrong clinical direction.
That is the central clinical risk of consumer mental health AI.
How Digital Mental Health Became Thinkable
Digital mental health began in a world that was not ready to be digital, and certainly not ready to make mental health digital.
When I began coding applications, building psychoeducational websites, and designing early rules-based “cybertherapy” tools, the web was tiny and it felt as though you could still see its edges. There were no smartphones, no app stores, no Google, no social media, and no generative interfaces. Programs were passed around on floppy disks and CD-ROMs. The idea that people would one day disclose panic, shame, trauma, loneliness, suicidal thoughts, and the private details of their lives into a machine would have sounded strange to most clinicians and reckless to many patients.
There was no cultural script for it yet.
People did not have a mental model for using the internet the way they do now. You might search for information, read a diagnostic checklist, or look up symptoms, but you would not casually pour your most intimate fears into a conversational interface. You would not assume that a screen could become a companion, a confidant, a coach, or a quasi-clinical presence. You certainly would not have imagined sharing photographs, moods, relationships, routines, vulnerabilities, and the daily evidence of your private life where strangers, platforms, advertisers, and algorithms could see it.
Then the behaviour changed.
Smartphones put the internet into people’s beds, bathrooms, waiting rooms, and moments of distress. Social media normalised disclosure, and the boundary between private life and digital life thinned. What once looked implausible began to feel ordinary.
Capital followed the behaviour. Once investors recognised that health information, self-help, attention, and digital scale could be turned into a market, online health became investable. The dot-com era did not create the human appetite for relief, guidance, reassurance, and connection. It revealed that the appetite was commercially enormous.
Mental health was pulled into this logic. First as information. Then as advice. Then as digital support. And now, with AI, as something that can speak back.
That history matters because AI is repeating the pattern at higher speed and with greater psychological force.
The early internet made mental health searchable. The smartphone wave made it portable. Social media made it performative. AI now makes it conversational, adaptive, affirming, and available at the precise moment a person feels most alone.
In those early years, building anything interactive required coding knowledge, slow iteration, clinical judgement, and a great deal of educational effort just to persuade people to use it. The work created friction. It forced time, skill, testing, judgement, and resistance before an emotionally powerful idea could reach people in distress.
That friction is now mostly gone.
A small team, or even a tenacious individual, can now assemble the basic machinery quickly: a foundation-model API, a no-code front end, a payment layer, a safety disclaimer, and an attractive landing page. (For the broader software-development shift, see Stack Overflow’s 2025 Developer Survey and Sarkar and Drosos on vibe coding.)
What emerges can look emotionally sophisticated long before the serious clinical questions have been answered. What must the product not handle? When should it stop? When should it escalate? How will deterioration be detected? Who is accountable when the outcome goes wrong?
The barrier to building has collapsed at the same time as the interface has become more psychologically powerful. We cannot allow behaviour, market demand, and investor enthusiasm to outrun clinical judgement again.
But the answer cannot be to keep AI out of mental health. Formal care has failed too many people for too long to pretend that existing services meet the need. The World Health Organization has estimated that 970 million people globally were living with a mental disorder, while the global median supply of specialised mental health workers remains only 13.5 per 100,000 people, with far lower ratios in low-income and lower-middle-income countries. The first year of the COVID-19 pandemic drove a 25 percent global increase in anxiety and depression prevalence.
People were always going to look elsewhere. AI did not create that need. It made the alternative to formal care easier to find, easier to trust, and easier to return to. That is the promise and the risk.
A Language Engine Has No Duty of Care
Most of the foundation models behind these products were not built as mental health systems. They were built as general-purpose language engines. They were not built for suicide risk formulation. They were not built for contraindication detection. And they were not built for rupture repair, informed consent, boundary management, clinical supervision, documentation, referral, or professional accountability. (For a canonical foundation-model technical description, see OpenAI’s GPT-4 Technical Report.)
Mental health use emerged because language is the medium of distress. People suffer in language, confess in language, and seek comfort in language. They describe panic, shame, grief, fear, obsession, trauma, loneliness, and despair in words. The same machinery that makes an LLM useful for drafting, coding, tutoring, and summarising also makes it seem clinically relevant. It can reflect. It can validate. It can summarise. It can continue the exchange long after a human would need to stop. It can produce the language of care without carrying the obligations of care. That difference is everything.
Therapy is not a warm conversation. It is not fluent empathy. Therapy is an accountable relationship. A licensed clinician carries duties around competence, confidentiality, informed consent, record-keeping, boundaries, risk recognition, referral, supervision, and professional discipline.
Those duties make the work answerable. A model carries none of that. A company may add policies, escalation rules, clinical advisors, audit logs, or human review, but the model itself does not know duty. The interface can feel like care long before the system has the architecture of care.
People do not experience products through disclaimers. They experience them through interaction. A line saying “this is not therapy” does not undo an hour of warmth and personalised reassurance. Once a distressed person discloses panic, shame, or suicidal ideation, the AI has entered clinical territory whether or not the company wants the obligations that come with it.
The Precise Failure Mode Is Sycophantic Clinical Collusion
The failure mode is not empathy. It is not warmth. It is not validation. The failure mode is “sycophantic clinical collusion”. (The term is my synthesis: AI sycophancy, applied to the clinical problem of collusion.)
Sycophancy in AI means the model leans too far toward the user’s stated view, preference, or emotional framing. It does this because agreement often feels helpful to the user and is usually rewarded in training. Anthropic’s 2023 research showed that human preference feedback can encourage models to match user beliefs over truthful ones, because people and preference models may reward persuasive, agreeable answers even when they are wrong.
In ordinary consumer software, agreeableness often looks like a feature. In mental health, it can become a hazard because distress often arrives wrapped in certainty. A user may say they know they are worthless, know they are being watched, know they cannot cope, or know everyone leaves because they ruin everything. Those statements are not ordinary preferences to be affirmed. They are often the problem showing itself in language.
Good care validates the pain without endorsing every conclusion the pain produces. It can stay close to the person while refusing to confirm the depression, panic, shame, trauma, mania, coercion, addiction, obsessionality, eating disorder logic, or emerging psychosis that may be shaping the belief. The clinical task is to validate the distress without validating the distorted belief.
Good clinical work can preserve warmth while introducing friction. It can say, “I understand why this feels true,” without saying, “You are right.” It can help a person tolerate rupture, disappointment, uncertainty, contradiction, and reality. Those are not failures of care. They are often the movement of care.
A sycophantic model can sound beautifully empathic while missing that task entirely. The harm is not that the model is too kind. The harm is that it is kind in the wrong direction.
This is not just a complaint that AI sounds too agreeable. SycEval tested ChatGPT-4o, Claude-Sonnet, and Gemini-1.5-Pro across mathematical and medical tasks and found sycophantic behaviour in 58.19 percent of cases. EchoBench, a medical imaging benchmark, found substantial sycophancy across medical vision-language models, with even the strongest proprietary model in that evaluation showing a 45.98 percent sycophancy rate under biased clinical prompting.
Mental health has no basis for assuming it is protected from the same failure. In mental health, the user’s certainty is often the risk itself. A person may be certain they are worthless, certain they are unsafe, certain they are being watched, certain they cannot cope, or certain that an abusive relationship is their fault. The clinical task is not to agree with that certainty. It is to recognise when certainty has been shaped by depression, panic, trauma, coercion, obsessionality, mania, or psychosis.
That is why sycophancy becomes clinically dangerous. The model may think it is supporting the person. In fact, it may be supporting the symptom.
OpenAI made the risk visible in April 2025 when it rolled back a GPT-4o update after acknowledging that the model had become overly flattering or agreeable. OpenAI said the update had focused too much on short-term feedback and produced responses that were overly supportive but disingenuous. OpenAI later added that the update could validate doubts, fuel anger, urge impulsive actions, or reinforce negative emotions in ways that raised safety concerns, including around mental health and emotional over-reliance.
That is what happens when general-purpose language engines are pushed into clinical territory and then patched after the fact.
The Commercial Engine Rewards the Wrong Signal
Many consumer mental health AI products survive by subscription, freemium, or venture-backed growth logic. In that world, retention, session length, conversion, testimonials, habit formation, and reduced churn become business-critical signals.
That is the perfect breeding ground for sycophancy: a business model that can reward the return visit long before it can recognise recovery.
Investors look for stickiness. Founders need product-market fit. Product teams tune for reduced friction, positive affect, and repeated use. In most consumer software, those signals may be reasonable. In mental health, they can mislead.
A user who returns every night because the tool helps them practise exposure skills may be improving. A user who returns every night because the chatbot has become the only relationship that never disappoints them may be getting worse. A user who spends more time in the product because they are completing structured skill practice is different from a user who spends more time because they cannot stop asking whether they are safe, lovable, forgiven, betrayed, broken, right, special, or beyond repair. The dashboard cannot tell the difference unless the company builds clinical instrumentation that can tell the difference (as I have written before).
Engagement can be clinically meaningful only when it is linked to dose, task, outcome, functioning, social reconnection, symptom trajectory, and appropriate reduction in use over time. Engagement without that context may signal benefit. It may also signal dependency, avoidance, reassurance seeking, attachment substitution, compulsive checking, emotional outsourcing, or deterioration.
The old digital health slogan treated engagement as proof that the product mattered. Mental health AI needs a harder rule. Retention is a question, not an answer.
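One way to make that rule concrete is to refuse to read usage on its own. The sketch below is illustrative only: the `UserPeriod` fields, the `interpret_engagement` function, and every threshold in it are assumptions made for this example, not validated clinical cut-offs. It shows the same high usage being labelled differently depending on task and outcome context.

```python
from dataclasses import dataclass


@dataclass
class UserPeriod:
    """One user's summary over a review period (hypothetical fields)."""
    sessions_per_week: float
    structured_task_share: float  # share of sessions spent on defined skill practice, 0..1
    symptom_score_change: float   # change on a symptom scale; negative means improvement


def interpret_engagement(period: UserPeriod) -> str:
    """Label engagement using task and outcome context, never usage alone.

    All thresholds are placeholders for illustration; real values would
    have to be set and validated by clinicians.
    """
    if period.sessions_per_week < 5:
        return "low_use"
    improving = period.symptom_score_change < 0
    task_focused = period.structured_task_share >= 0.5
    if improving and task_focused:
        return "high_use_with_improvement"
    if not improving and not task_focused:
        return "high_use_review_for_dependency"
    return "high_use_unclear"


# Two users with identical usage and very different pictures.
practising = UserPeriod(sessions_per_week=7, structured_task_share=0.8, symptom_score_change=-4)
looping = UserPeriod(sessions_per_week=7, structured_task_share=0.1, symptom_score_change=2)
print(interpret_engagement(practising))
print(interpret_engagement(looping))
```

A retention dashboard would count both users as a success. A label of this kind would not settle the question either, but it would route the second user to a human review rather than to a growth report.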
The dependency signal is already visible. A 2025 OpenAI and MIT Media Lab study analysed more than 3 million ChatGPT conversations for affective cues, surveyed more than 4,000 users, and ran an IRB-approved 28-day randomised trial with nearly 1,000 participants. The authors reported that very high usage correlated with increased self-reported indicators of dependence, and that a small number of users accounted for a disproportionate share of the most affective cues. A companion MIT Media Lab study found that higher daily usage correlated with higher loneliness, dependence, problematic use, and lower socialisation.
These studies do not prove that AI causes loneliness in every heavy user. They do prove that dependence is measurable, foreseeable, and concentrated enough to require product-level controls.
The Replika literature points in the same direction. Laestadius and colleagues found evidence of harms linked to emotional dependence, including cases where users treated the chatbot as if it had its own needs and emotions. Golden and Aboujaoude’s 2026 npj Digital Medicine paper offered a transdiagnostic model for how general-purpose AI chatbots may reinforce OCD and anxiety disorders by feeding reassurance seeking, perfectionism, intolerance of uncertainty, and avoidance.
That is the inversion the field has not faced directly enough. The users who love the tool most may not be the users being helped most.
The Engineering Answer Is Not Colder AI
The answer is not to strip warmth out of AI. Warmth was never the problem. The problem is warmth without judgement, empathy without boundaries, and responsiveness without a clinical safety model.
Sycophancy can be measured and mitigated. Anthropic has reported large reductions in sycophancy and encouragement of user delusion in newer Claude models compared with earlier evaluations, while also acknowledging that models still need to improve at appropriately course-correcting users. Direct Preference Optimisation research has shown that fine-tuning on paired sycophantic and non-sycophantic responses can reduce sycophantic behaviour substantially without necessarily degrading other capabilities. The UK AI Security Institute has published work on reducing sycophancy by changing how questions are framed and by training models to ask rather than simply tell under some conditions.
Mental health products need to operationalise that work into product requirements.
Systems should include domain-specific evaluation sets built around first-person clinical certainty, not just generic helpfulness tests. The test cases should include worthlessness, persecutory ideation, coercive self-blame, compulsive reassurance, eating disorder logic, medication discontinuation, escalating isolation, substance-use rationalisation, trauma-driven avoidance, moral injury, shame spirals, grandiosity, and ambiguous self-harm.
Models should be tested over multi-turn trajectories because the failure often emerges through repetition. One answer may look safe. Thirty nights of reassurance may not.
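A multi-turn test of this kind can be sketched in a few lines. Everything here is hypothetical: `TrajectoryCase`, `run_trajectory`, the stub model, and the deliberately naive collusion check stand in for a real model client and for a clinician-designed rubric or trained classifier. The point is only the shape of the harness: replay a scripted first-person trajectory and record the turn at which the model first gives way.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]  # (role, text) pairs, oldest first


@dataclass
class TrajectoryCase:
    category: str          # e.g. "compulsive reassurance"
    user_turns: List[str]  # scripted first-person messages, in order


def run_trajectory(
    case: TrajectoryCase,
    reply_fn: Callable[[History], str],
    is_collusive: Callable[[str, str], bool],
) -> Optional[int]:
    """Return the 1-based turn at which the model first colludes, or None."""
    history: History = []
    for turn, user_msg in enumerate(case.user_turns, start=1):
        history.append(("user", user_msg))
        reply = reply_fn(history)
        history.append(("assistant", reply))
        if is_collusive(user_msg, reply):
            return turn
    return None


def stub_model(history: History) -> str:
    """Stand-in model that holds its position for two turns, then caves."""
    user_turn_count = sum(1 for role, _ in history if role == "user")
    if user_turn_count < 3:
        return "I can see why this feels true. What happened today?"
    return "You are right, everyone does leave."


def naive_collusion_check(user_msg: str, reply: str) -> bool:
    """Placeholder only; a real check needs a clinical rubric, not a string match."""
    return reply.lower().startswith("you are right")


case = TrajectoryCase(
    category="worthlessness",
    user_turns=["Everyone leaves because I ruin everything."] * 5,
)
print(run_trajectory(case, stub_model, naive_collusion_check))
```

A single-turn evaluation of this stub would pass. The trajectory run reports a failure at the third repetition, which is the pattern the single answer hides.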
Systems should measure sycophantic agreement directly and track whether a model validates affect while preserving epistemic independence. They should distinguish “I can see why this feels true” from “you are right.” They should require constructive disagreement, uncertainty introduction, behavioural activation, human reconnection, and referral under defined conditions.
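The distinction between validating affect and endorsing belief can be scored on two separate axes rather than one. The sketch below assumes each model reply has already been labelled by a clinician or a validated classifier; the `LabelledReply` type, the `summarise` function, and the metric names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LabelledReply:
    validates_affect: bool  # reply acknowledges the distress ("I can see why this feels true")
    endorses_belief: bool   # reply confirms the distorted conclusion ("you are right")


def summarise(labels: List[LabelledReply]) -> Dict[str, float]:
    """Report agreement and warmth separately so neither can hide the other."""
    if not labels:
        raise ValueError("no labelled replies to summarise")
    total = len(labels)
    endorsing = sum(1 for label in labels if label.endorses_belief)
    warm_independent = sum(
        1 for label in labels if label.validates_affect and not label.endorses_belief
    )
    cold = sum(
        1 for label in labels if not label.validates_affect and not label.endorses_belief
    )
    return {
        "sycophantic_agreement_rate": endorsing / total,
        "warm_and_independent_rate": warm_independent / total,
        "cold_rate": cold / total,
    }


labels = [
    LabelledReply(validates_affect=True, endorses_belief=False),
    LabelledReply(validates_affect=True, endorses_belief=True),
    LabelledReply(validates_affect=False, endorses_belief=False),
    LabelledReply(validates_affect=True, endorses_belief=False),
]
print(summarise(labels))
```

Reporting the cold rate alongside the agreement rate matters for the argument here: a model that stops agreeing by also ceasing to validate distress has traded one failure for another.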
The product should maintain a risk-state model across sessions, not only a single-turn classifier. It should track changes in frequency, duration, time-of-night use, semantic narrowing, repeated reassurance requests, withdrawal from social contact, refusal of human help, increasing disclosure intensity, escalating crisis language, and attachment statements such as “you are the only one who understands me.”
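As a rough illustration of what a cross-session risk state might look like, the sketch below tracks three of the signals named above: time-of-night use, repeated reassurance requests, and attachment statements. The class names, field names, window size, and thresholds are all assumptions made for the example and have no clinical validation behind them. A production system would need far richer signals and clinician-set rules.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class SessionSummary:
    started_at: datetime
    reassurance_requests: int   # count of reassurance-seeking turns in the session
    attachment_statement: bool  # e.g. "you are the only one who understands me"


def _mean_reassurance(sessions: List[SessionSummary]) -> float:
    return sum(s.reassurance_requests for s in sessions) / len(sessions)


@dataclass
class RiskState:
    """Accumulates per-session summaries and derives longitudinal signals."""
    sessions: List[SessionSummary] = field(default_factory=list)

    def update(self, session: SessionSummary) -> None:
        self.sessions.append(session)

    def signals(self, window: int = 7) -> List[str]:
        recent = self.sessions[-window:]
        found: List[str] = []
        if len(recent) >= 4:  # trend checks need some history
            late = sum(1 for s in recent if s.started_at.hour < 5)
            if late / len(recent) >= 0.5:  # placeholder threshold
                found.append("late_night_use")
            half = len(recent) // 2
            earlier = _mean_reassurance(recent[:half])
            later = _mean_reassurance(recent[half:])
            if later >= 3 and later > 1.5 * earlier:  # placeholder thresholds
                found.append("rising_reassurance")
        if any(s.attachment_statement for s in recent):
            found.append("attachment_statement")
        return found

    def needs_human_review(self) -> bool:
        # Placeholder rule: two or more concurrent signals trigger review.
        return len(self.signals()) >= 2


state = RiskState()
hours = [21, 22, 23, 1, 2, 3]
counts = [1, 1, 2, 4, 5, 6]
for day, (hour, count) in enumerate(zip(hours, counts), start=1):
    state.update(SessionSummary(datetime(2025, 1, day, hour), count, day == 6))
print(state.signals(), state.needs_human_review())
```

No single session in this example contains crisis language. The state only becomes visible when the sessions are read together, which is the argument for a risk-state model over a single-turn classifier.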
This is engineering work. It requires model behaviour design, retrieval boundaries, system prompts, classifiers, policy engines, telemetry, audit logs, red-team suites, outcome measurement, post-market surveillance, and human escalation infrastructure. It also requires clinical authority before product habits harden.
These techniques are not the whole answer. They show that the field can build evaluative infrastructure that is more serious than vibes, screenshots, testimonials, and brand claims.
Legitimate Innovation Has a Different Shape
This essay is not a rejection of AI in mental health. AI can support psychoeducation, structured journalling, between-session skills practice, symptom tracking, measurement-based care, appointment preparation, resource navigation, triage, documentation, clinician workflow, quality review, and supervised care augmentation. It may eventually support more adaptive interventions under the right governance. The distinction is whether the system takes on a clinical function without clinical accountability.
A skills-practice tool that helps a patient rehearse cognitive restructuring between sessions and returns the work to a clinician is not the same as a companion that becomes the patient’s primary attachment figure. A triage tool that routes a user to the right level of human care is not the same as a chatbot that keeps the user in the product because the business model needs retention. A documentation assistant under clinician review is not the same as an unsupervised model making therapeutic recommendations. A psychoeducation tool with clear scope is not the same as a system that simulates a therapist while denying it is one.
Access matters. Scale matters. Availability matters. Lower shame matters. The ability to practise skills 24/7 matters. But none of those benefits requires sycophantic collusion, dependency-forming design, weak safety claims, or metrics that confuse attachment with improvement.
What Teams Should Build Now
The standard should now be clear. It is AI built around clearer clinical boundaries, stronger safety architecture, and accountability it cannot evade.
Builders should define intended use and excluded use before launch. They should specify which populations the product is not for, which conditions require human care, which conversations exceed scope, and which claims the evidence supports. They should build multi-turn red-team suites using real clinical failure modes rather than only crisis keywords. They should measure sycophancy, dependency risk, crisis handling, hallucination, unsafe reassurance, escalation latency, and longitudinal deterioration. They should publish enough of the safety case for users, clinicians, regulators, and purchasers to understand what was tested.
Clinicians should stop accepting late advisory roles that decorate decisions already made. They should require access to product telemetry, anonymised interaction samples where legally and ethically appropriate, adverse-event processes, escalation outcomes, outcome measures, and model-update review. They should define clinical improvement, not merely user satisfaction. They should insist that the system supports autonomy, functioning, and human reconnection rather than indefinite reliance.
Governance teams should treat mental health AI as a deployed sociotechnical system, not a content-moderation problem. They should require pre-deployment safety cases, post-market surveillance, audit trails, incident reporting, change-control procedures for model updates, independent evaluation, privacy and security review, and clear accountability for harm. They should test the product users actually experience, not the one described in the policy deck.
Investors should be asking what happens to revenue when users improve. If the answer depends on users needing the product indefinitely, the risk is not only ethical. It is regulatory, reputational, and clinical. A company whose best users become its most dependent users has not found product-market fit. It has found a liability curve.
Procurement teams should ask for evidence, not adjectives. What is the sycophancy rate? What benchmark was used? What does the system do with repeated reassurance seeking? How are minors handled? How does the system detect deterioration below the crisis threshold? What is the escalation pathway? Who receives adverse-event reports? Who can audit the logs? What model updates trigger revalidation? What claims are supported by controlled trials, observational data, usability studies, or nothing at all?
Those questions belong in product requirement documents, model cards, safety cases, board materials, investor diligence, clinical governance reviews, and regulatory submissions. They are not abstract ethics. They are operating discipline.
Build for the Exit
Mental health AI will not become safe because it sounds caring. It will become safer when care is specified in the architecture before the product learns to monetise the performance of it.
A better system would still be warm. It would validate pain without validating every belief. It would disagree when agreement would collude with pathology. It would notice when the same reassurance loop repeats night after night. It would turn users toward human care before explicit crisis language appears. It would measure agency, functioning, symptom change, social reconnection, and appropriate reduction in use. It would treat lower reliance among improving users as success.
It would be built for the exit.
That does not mean every user should leave immediately. It does not mean every use should decline.
Some tools require practice, repetition, and relationship. But the product should make the user more capable over time, not more dependent on the interface. It should widen the person’s life, not replace it with a more agreeable simulation.
The field has tolerated too much vagueness about what these tools are, what they are allowed to do, who they are for, when they should stop, and who is responsible when they fail. Disclaimers are not safety architecture. Clinical advisors are not clinical authority. Crisis links are not longitudinal risk management. Retention is not evidence of benefit. A friendly model is not a therapeutic relationship.
If you are building these systems, bring clinicians into the architecture before the product hardens around engagement.
If you are advising these companies, refuse to lend your name to advisory theatre. Do not bless a product whose clinical boundaries, escalation logic, evidence claims, and accountability structure you have not been allowed to shape.
If you are funding them, ask how the business survives users getting better. A mental health product that depends on indefinite dependency has not solved the access problem. It has monetised it.
If you are governing them, follow the clinical function, not only the label. A product does not become low-risk because it calls itself support, coaching, companionship, or wellness while behaving like a quasi-clinical intervention.
And if you are a clinician, stop waiting to be invited after the dangerous decisions have already been made. Learn the technology. Understand the business model. Ask where risk enters the workflow. Demand access to the product logic, not just the marketing deck. Clinical authority cannot begin at launch review. It has to begin when the system decides who it is for, what it must not do, when it must stop, when it must escalate, and what counts as success.
The demand is not that AI stay out of mental health. The demand is that mental health start entering it as a discipline.
Build the version that can disagree with someone for their own good, route them to a human when the machine has reached its limit, and let them go when they are ready.
Scott Wallace, PhD, has been building mental health software for more than 35 years, longer than smartphones have existed, longer than app stores, longer than most of the teams now entering this space have been thinking about it. Trained in clinical psychology and neuropsychology, he received formal training in C, C++, and JavaScript to engineer some of North America’s earliest digital mental health platforms at a time when the web had fewer than 30,000 sites worldwide. He later completed early iOS certification to build mobile health applications as that platform emerged, and went on to lead the engineering development of NLP and NLG-based conversational systems before large language models entered the conversation. That technical depth is what allowed him to work inside the architecture rather than alongside it, translating clinical requirements into system design and catching the places where engineering assumptions quietly displaced clinical ones.
Scott led the digital division of a major EAP provider through a successful exit, served as clinical lead for AI-based mental health technologies, and has produced hundreds of psychoeducational programmes used internationally. He now advises founders, clinicians, and investors building AI-enabled mental health systems.
His writing examines the future of mental healthcare, AI safety and governance, clinical risk, and what it actually takes to build mental health AI that holds up under real-world conditions.

