CBT Research Doesn’t Show What You Think It Does
When science becomes public relations.
According to a recent article in JAMA Psychiatry, “CBT is effective in treating mental disorders.” That’s misleading. About 75% of people who receive CBT for depression, the most studied disorder, don’t get well.
Most people think “effective” means getting well and staying well. But the research shows only that the average patient does better than the control group. That’s very different.
No one goes to therapy to do better than a control group. They go to get well, or at least meaningfully better.
There’s a sleight of hand between the question that matters and the answer we’re given:
Researchers answer the question, “Do people do better than the control group?”
They discuss it publicly as if they’ve answered the question, “Do people get well?”
This is the kind of thing that makes people skeptical of “experts.” They may not grasp all the technicalities but they sense they’re not getting the whole story.
Things get distorted—seriously distorted—when complex research findings are reported to clinicians and the public. Let’s unpack it.
First, the research is not designed to answer the question, “Do people get well?” The research yields quantitative findings, tons of them. Without professional-level expertise in both statistics and psychotherapy research, no one could make sense of them. One table in the JAMA Psychiatry paper contains 440 numeric values. That’s one table out of five, in one article alone.
Few outside this specialty within a specialty would even know where to find the relevant information, let alone what it means. Almost everyone has to rely on someone else to interpret the findings, and the interpreters have an agenda.
The research doesn’t show that CBT is effective, works, or is the “gold standard” (whatever that means).
Those terms are interpretations grafted onto arcane, complex data to give a neat takeaway.
The real research findings
Let’s zoom in on one of the five tables in the JAMA Psychiatry paper (Table 2) and just one of its 440 numbers. This may feel obscure, but that’s the point.
The number is .74.
That may be the single best answer to the question, “Is CBT effective for depression?”
It doesn’t mean much to non-experts, so I’ll add: the number is an effect size called Hedges g.
But you still don’t know whether people get well.
“I’m not a statistician,” you protest. “If someone tells me what ‘Hedges g’ is, I could draw my own conclusion.”
Hedges g = .74 means that on a “bell curve,” the CBT group improves about three-quarters of a standard deviation on a symptom scale, relative to a control group that gets no treatment for depression.
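For readers who want it concrete, here is a short Python sketch (my own illustration, not something from the paper) that translates g = .74 into more intuitive quantities, under the usual assumption of normal distributions:

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

g = 0.74  # the Hedges g reported in Table 2

# Cohen's U3: the fraction of the control group that the *average*
# CBT patient ends up ahead of on the symptom scale.
u3 = phi(g)

# Probability of superiority: the chance that a randomly chosen CBT
# patient improves more than a randomly chosen control patient.
prob_superiority = phi(g / math.sqrt(2.0))

print(f"U3: {u3:.0%}")                            # about 77%
print(f"P(superiority): {prob_superiority:.0%}")  # about 70%
```

Even translated this way, the number describes the average separation between groups. It still says nothing about whether any particular person got well.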
Maybe you’re now wondering why the studies don’t provide answers in numbers people understand, like remission rates for cancer, or pounds or kilos for weight loss.
We can’t measure depression in pounds, but researchers could still present results in plain terms. They could say, for example, “people who get CBT see a 15% reduction in depression symptoms compared to a control group.” That number is hypothetical—but notice how much clearer the statement is.
Wouldn’t that be more informative than just reporting Hedges g, or using buzzwords like effective, works, or evidence-based?
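For what it’s worth, here is how such a plain-language number could be computed from an effect size. The scale values below are hypothetical, chosen only to show the arithmetic, and are not from the paper:

```python
# Purely illustrative: converting an effect size into scale points.
g = 0.74
baseline_mean = 20.0  # HYPOTHETICAL pre-treatment symptom score
pooled_sd = 7.0       # HYPOTHETICAL standard deviation of the scale

raw_difference = g * pooled_sd                       # ~5.2 scale points
percent_reduction = raw_difference / baseline_mean   # ~26% vs. control

print(f"Difference vs. control: {raw_difference:.1f} points "
      f"(~{percent_reduction:.0%} of the baseline score)")
```

With the real scale means and standard deviations, which every trial records, the conversion is one line of arithmetic.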
If you’re asking why therapy researchers don’t provide that information—good question. Here are three answers.
First: Researchers ask the wrong questions
The problem isn’t your lack of statistical knowledge. It’s that the researchers themselves often don’t know whether the treatments they study offer meaningful, real-life benefits.
World-renowned CBT researcher Alan Kazdin put it bluntly in the flagship journal of the American Psychological Association:
“Researchers often do not know if clients receiving an evidence-based treatment have improved in everyday life or changed in a way that makes a difference.”
“It is possible that evidence-based treatments with effects demonstrated on arbitrary metrics do not actually help people, that is, reduce their symptoms and improve their functioning.”
Therapy researchers work in academic silos where incentives don’t align with clinical reality.
Even if you mastered every statistic, you wouldn’t necessarily know whether the patients get well.
Doesn’t it seem a little odd to use the term “evidence-based” in the same sentence that acknowledges the treatments may not actually help people?
Second: The treatments don’t work for most
The treatments don’t work—at least not in the plain-English sense of the word.
In research trials, about 75% of people who get brief CBT for depression don’t improve, or relapse within months.
Even that overstates the benefits, because almost half of the patients in the control group also get well—without any treatment for depression. Do the math and you see: only about one in ten people gets well who wouldn’t have gotten well anyway.
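Here is that math spelled out, with round hypothetical percentages chosen only to illustrate the logic:

```python
# Illustrative arithmetic, not exact values from the meta-analyses:
# suppose some fraction of each group meets the study's "gets well" bar.
p_cbt = 0.55      # hypothetical: proportion who get well with brief CBT
p_control = 0.45  # hypothetical: proportion who get well with no treatment

attributable = p_cbt - p_control   # benefit attributable to CBT
nnt = 1.0 / attributable           # "number needed to treat"

print(f"Attributable benefit: {attributable:.0%}")            # ~10%
print(f"NNT: about {nnt:.0f} treated per additional recovery")
```

A ten-point difference means treating about ten people for one additional recovery that would not have happened anyway.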
Maybe some experts can, in good conscience, call that effective. I can’t.
These numbers come from the very same research articles people cite when they claim that CBT is “effective” for depression.
For a deeper dive, I wrote a journal article explaining why there’s such a gap between what we’re told research shows and what it actually shows.
Third: They’re not studying real psychotherapy
This last point goes to the very heart of the misaligned academic incentive system:
I don’t believe the therapy researchers are studying actual psychotherapy. They’re studying fictions of their own invention and calling it therapy.
By the time people get to a psychotherapist, they’re usually dealing with long-standing, ingrained problems. They don’t wake up depressed on Monday and call a psychotherapist on Tuesday. Psychotherapy is generally the last resort, not the first.
And for these people—the ones we treat in the real world—here’s what’s true on average: real change in psychotherapy begins at about six months and continues for one to two years.
Those are realistic durations for psychotherapy. For a deeper dive, read my article The Tyranny of Time, which explains how long psychotherapy really takes.
But researchers don’t study treatments of realistic duration. Nearly all the research trials focus on very brief treatments of only 8-12 sessions. Treatment is over before meaningful psychological change begins. These abbreviated treatments are a researcher’s fantasy; they bear little resemblance to psychotherapy in the real world.
This is why I say researchers are studying fictions of their own invention.
It comes down to truth in advertising: if words like “effective” and “evidence-based” don’t mean real and lasting change for real-world patients, then maybe we should stop using them.
You should not fool the layman when you’re talking as a scientist.
—Richard Feynman, Cargo Cult Science
References
Cuijpers, P., Harrer, M., Miguel, C., et al. (2025). Cognitive behavior therapy for mental disorders in adults: A unified series of meta-analyses. JAMA Psychiatry, 82(6), 563–571.
Cuijpers, P., Karyotaki, E., Ciharova, M., et al. (2021). The effects of psychotherapies for depression on response, remission, reliable change, and deterioration: A meta-analysis. Acta Psychiatrica Scandinavica, 144(3), 288–299.
Feynman, R.P. (1974). Cargo cult science. Commencement address, California Institute of Technology.
Kazdin, A. E. (2006). Arbitrary metrics: Implications for identifying evidence-based treatments. American Psychologist, 61(1), 42–49.
Shedler, J. (2018). Where is the evidence for “evidence-based” therapy? Psychiatric Clinics of North America, 41(2), 319–329.
Shedler, J., & Gnaulati, E. (2020). The tyranny of time: how long does effective therapy really take? Psychotherapy Networker. Retrieved from https://www.psychotherapynetworker.org/article/tyranny-time/
Read my companion post: The “Evidence-Based Therapy” Game Explained with Pancakes
More essays, interviews, clips, and reflections: linktr.ee/jonathanshedler


Add in the fact that many control groups are meaningless and do not reflect real life. Wait-list controls are one of the worst examples, but there are others as well. If you took even the worst of these control groups and compared it against any kind of social talk, I think people would improve to some degree.
As I tell medical students, meta-analyses are magic math. Here’s an example:
“The Egger test indicated significant asymmetry of the funnel plot for MDD, SAD, GAD, PTSD, BED, and BIP but not for PAN, PHOB, OCD, BN, and PSY (eAppendix 17 in Supplement 1). Adjustment for publication bias through Duval and Tweedie’s “trim and fill” procedure resulted in smaller SMDs in all disorders (except for PHOB) and suggested that 20% of studies were missed.”
In plain English, there was a lot of publication bias (“significant asymmetry”). This should have prompted a search for unpublished studies (grey literature); however, there’s no evidence that occurred (it requires medical librarians and a lot of time). Instead, some magic math was applied to account for the missing studies.
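For the curious, here is roughly what the Egger test checks, as a minimal numpy sketch with toy data of my own invention (not the paper’s data or code):

```python
import numpy as np

def egger_test(effects, ses):
    """Egger's regression test for funnel-plot asymmetry.

    Regresses each study's standardized effect (effect / SE) on its
    precision (1 / SE). An intercept far from zero suggests small
    studies report systematically larger effects (publication bias).
    Returns the intercept and its t-statistic.
    """
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    t = effects / ses                      # standardized effects
    precision = 1.0 / ses
    X = np.column_stack([np.ones_like(precision), precision])
    coef, *_ = np.linalg.lstsq(X, t, rcond=None)
    resid = t - X @ coef
    sigma2 = resid @ resid / (len(t) - 2)  # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)  # OLS covariance matrix
    return coef[0], coef[0] / np.sqrt(cov[0, 0])

# Toy data: the smaller the study (larger SE), the bigger its effect,
# which is the classic asymmetric funnel.
effects = [0.90, 0.80, 0.70, 0.50, 0.40, 0.35]
ses = [0.40, 0.35, 0.30, 0.15, 0.10, 0.08]
intercept, t_stat = egger_test(effects, ses)
print(f"Egger intercept: {intercept:.2f} (t = {t_stat:.1f})")
```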
Meta-analyses assume the component studies are simple replications (like a college chemistry assignment where you run the same protocol several times). But they never are, which leads to heterogeneity:
“Heterogeneity was modest (I2 <50%) for BIP and OCD, high (50%-75%) for PAN, BN, BED and PSY, and very high (>75%) for the other 5 disorders.”
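And here is a minimal sketch of what I² measures, again with toy numbers of my own invention:

```python
import numpy as np

def i_squared(effects, ses):
    """Higgins I²: percent of between-study variability beyond chance.

    Computes Cochran's Q against the fixed-effect pooled estimate,
    then I² = max(0, (Q - df) / Q) * 100.
    """
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    w = 1.0 / ses**2                            # inverse-variance weights
    pooled = np.sum(w * effects) / np.sum(w)    # fixed-effect estimate
    q = np.sum(w * (effects - pooled) ** 2)     # Cochran's Q
    dof = len(effects) - 1
    return max(0.0, (q - dof) / q) * 100.0

# Toy data: studies that disagree far more than their standard errors
# allow, producing "very high" heterogeneity (>75%).
print(f"I² = {i_squared([0.2, 0.9, 0.4, 1.1], [0.10, 0.12, 0.09, 0.15]):.0f}%")
```

Values above 75% mean most of the spread between studies reflects genuine disagreement, not chance, which is exactly the situation in which a single pooled average becomes hard to interpret.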
It goes on and on.