In 1894, Johns Hopkins surgeon William Stewart Halsted published the results of his “complete operation” for breast cancer — an en-bloc amputation of the breast, both pectoral muscles, and the axillary lymph nodes — and reported that it had cut local recurrence from the 51–82% rates of his European contemporaries to a fraction of that; the gap between that local-control victory and the survival it never delivered is the entire case. The radical mastectomy controlled the wound bed and was mistaken, for three-quarters of a century, for a control of the disease. It was performed on the order of nine in ten American women with breast cancer well into the 1970s, left them with a hollowed chest wall, a frozen shoulder, and near-ubiquitous arm lymphedema, and — as randomized trials would eventually show — bought not one additional day of survival over far lesser surgery.
The operation did not fail because it was crude. It was, by the standards of 1894, a genuine advance: Halsted’s en-bloc dissection and his obsession with surgical technique made him one of the founders of modern American surgery, and the early survival figures — a five-year survival roughly double that of untreated women — were real. The error was theoretical. Halsted built the operation on an anatomical hypothesis: that breast cancer spread in an orderly, centrifugal, contiguous fashion outward from the breast through the lymphatics, so that cutting wider and deeper must, by geometry, cut ahead of the disease. If the theory were true, more radical surgery would mean more cures. The theory was false.
Cancer that had spread had usually spread through the bloodstream before the surgeon ever arrived, and cancer that had not spread was cured by far less. The radical mastectomy’s mutilating margins therefore changed the scar without changing the outcome. Critics argued this for decades: Geoffrey Keynes in England from the 1930s and George “Barney” Crile Jr. at the Cleveland Clinic from the 1950s. Both were dismissed by a surgical establishment that treated the Halsted operation as settled doctrine.
The reckoning came from a randomized trial run by a surgeon who had once performed the operation himself. Bernard Fisher’s NSABP Protocol B-04, begun in 1971, randomized 1,665 women among radical mastectomy and two lesser procedures; B-06, begun in 1976, added lumpectomy. At every follow-up out to 25 years, survival did not differ significantly between the arms. The Halsted hypothesis of contiguous spread was replaced by the systemic-disease model, which holds that breast cancer is, at diagnosis, often already a whole-body problem the scalpel cannot outrun. The radical mastectomy was not banned; it was abandoned, retired by evidence as the textbook case of a mutilating operation sustained for 75 years by an elegant theory that happened to be wrong.
In July 2002, orthopedic surgeon J. Bruce Moseley and a Houston Veterans Affairs team reported in the New England Journal of Medicine that 180 patients with osteoarthritis of the knee, randomized double-blind to arthroscopic débridement, arthroscopic lavage, or a sham operation in which surgeons made skin incisions but inserted no instrument, had identical outcomes — and the gap between that finding and a decade of confident practice is the entire case. By 2002 the scope-and-clean operation for the arthritic knee was being performed on the order of 650,000 times a year in the United States at roughly $5,000 apiece, a multi-billion-dollar standard of care, on the mechanistic premise that flushing out debris and trimming frayed cartilage relieved pain. The trial showed it relieved nothing the placebo did not.
The harm here was measured not in deaths but in unnecessary operations: hundreds of thousands of patients each year underwent real surgery, with its anesthesia, incisions, infection risk, recovery, and deductibles, to obtain a benefit indistinguishable from that of being wheeled into an operating room, cut, and sewn shut. At no point over two years of follow-up did either intervention group report less pain or better function than the sham group; the 95 percent confidence intervals excluded any clinically meaningful difference. The wonder of arthroscopy had been real for torn menisci and loose bodies, but for arthritis pain it was theater.
What makes the episode an exemplar of withdrawal is that it was killed by the right kind of evidence. Surgery had long been treated as exempt from the placebo-controlled standard demanded of drugs, on the assumption that an operation cannot ethically be faked. Moseley’s team did precisely that — and the result was so clean that the Centers for Medicare and Medicaid Services moved within a year to defund the procedure for osteoarthritis. A 2008 Canadian trial led by Alexandra Kirkley confirmed that arthroscopy added nothing to optimized physical and medical therapy, and by 2017 international guideline panels were issuing strong recommendations against it. The operation was never recalled or banned. It was disconfirmed, defunded, and abandoned — a textbook demonstration that a popular surgery can be a placebo, and that without a sham control no one would have known.
In 1987 a team led by neurologist Olle Lindvall and neuroscientist Anders Björklund at Lund University, Sweden, began implanting dopamine-producing cells dissected from aborted human fetuses into the brains of Parkinson’s patients; the open-label results of the 1990s — surviving grafts on PET, patients walking who had been frozen — were celebrated as the first biological cure for a neurodegenerative disease. The gap between that promise and the controlled evidence is the case. Tested the way a drug would be — against sham brain surgery, double-blind — the graft did not beat placebo on its primary endpoint and inflicted a new, largely untreatable harm: persistent involuntary movements that ran on after every drop of levodopa was withdrawn.
Both trials that ended the era were funded by the U.S. National Institutes of Health and built around a placebo arm earlier enthusiasts had called unnecessary. In Curt Freed’s Denver–Columbia trial, published in The New England Journal of Medicine on March 8, 2001, 40 patients aged 34 to 75 were randomized to a fetal-tissue graft or to sham surgery — burr holes drilled, no cells implanted. The graft showed no benefit on the pre-specified global rating; a positive signal appeared only in a post-hoc subgroup aged 60 or younger. Then came the harm: dystonia and dyskinesias in roughly 15 percent of grafted patients (5 of 33), persisting after levodopa was reduced or stopped. The second NIH trial, run by neurologist C. Warren Olanow and published in Annals of Neurology in September 2003, deepened the failure: across 34 patients, no significant effect on the motor UPDRS (p = 0.244) at 24 months, 56 percent with off-medication dyskinesia, and a conclusion that transplantation “currently cannot be recommended as a therapy.”
The case is exemplary because the grafts worked biologically and failed clinically. Fluorodopa uptake rose; dopamine neurons survived robustly and were confirmed at autopsy. The cells lived — but thriving grafts drove a runaway, unregulated release of dopamine the brain could not modulate, leaving a procedure that could not be titrated, withdrawn, or reversed: a worse failure mode than the disease it meant to cure. The field abandoned routine fetal grafting and turned to the problem it had skipped — proving, against placebo, that putting cells in a brain helps the person attached to it.
Between roughly 1989 and 2002 American oncologists put an estimated 30,000–40,000 women with breast cancer through high-dose chemotherapy with autologous bone-marrow or stem-cell rescue (HDC/ABMT) before a single randomized trial had shown it saved lives; when the trials reported in 1999 and 2000, the gap between promise and result was total. The regimen, massive cytotoxic doses that destroyed the marrow followed by reinfusion of the patient’s banked cells to keep her alive, offered no survival benefit over conventional-dose chemotherapy, killed a meaningful fraction of patients through treatment-related toxicity, and cost an estimated $3.4 billion to deliver. The one study claiming a dramatic advantage was found to be fraudulent.
The procedure was never FDA-approved as a breast-cancer cure and never validated by a controlled trial during its boom. It spread on a seductive mechanistic story — breast cancer was dose-responsive, so more poison meant more cures — and on a litigation campaign that turned insurers’ refusal to pay into public scandal. The 1993 Fox v. Health Net verdict, awarding a dead schoolteacher’s family $89 million including $77 million in punitive damages, taught every HMO that denying the transplant was costlier than paying for it; coverage cascaded and four state legislatures mandated it.
The keystone of clinical belief was the work of South African oncologist Werner Bezwoda of the University of the Witwatersrand, whose trials alone reported a roughly three-fold survival advantage; when four randomized trials were presented together at the 1999 American Society of Clinical Oncology (ASCO) meeting, the other three showed no benefit. U.S. auditors who reached Johannesburg in early 2000 found his randomization existed only on paper and his control group had never received the standard treatment he reported. He was fired for “scientific misrepresentation” in 2000 and the Journal of Clinical Oncology retracted his work in 2001. HDC/ABMT for breast cancer is now the textbook case of an unproven, lethal intervention scaled to tens of thousands by hope, courtroom pressure, and a single fraud — abandoned not because it was banned, but because the evidence it had skipped finally arrived and demolished it.
In 1920 the Chicago obstetrician Joseph Bolivar DeLee, in a paper titled “The Prophylactic Forceps Operation,” urged physicians to cut the perineum of laboring women as a routine to spare them the worse damage of a ragged spontaneous tear, and the gap between that protective promise and the eventual evidence is the entire case. By the late twentieth century the operation DeLee reasoned his way into was one of the most common surgical procedures performed on American women, done in roughly six in ten vaginal deliveries (60.9% in 1979) and on a clear majority of first-time mothers, almost none of whom were told there was no trial behind it.
The justification was intuitive: a clean, controlled incision must heal better than a jagged laceration, and a pre-emptive cut must protect the pelvic floor against future prolapse and incontinence. The intuition was wrong in the most consequential way. When the procedure was finally tested against the comparator it had skipped for decades — selective use, cutting only on indication — the routine cut did not prevent severe trauma. A midline episiotomy extended the wound straight toward the anal sphincter and rectum, so the prophylactic incision was itself causally linked to the third- and fourth-degree tears it was meant to forestall.
The reckoning was slow because the practice was entrenched, not because the data were ambiguous. A 1983 interpretive review of more than 350 sources spanning 1860 to 1980 found no defensible evidence for routine use; the 1993 Argentine Episiotomy Trial, a randomized study of 2,606 women, showed routine use conferred no benefit and more harm; and the 2005 AHRQ-commissioned systematic review in JAMA closed the question, finding routine episiotomy improved no immediate outcome and prevented no incontinence or prolapse. In April 2006 the American College of Obstetricians and Gynecologists issued Practice Bulletin No. 71, recommending the routine be restricted. The procedure was not banned — it retains narrow, evidence-based indications — but its eighty-year career as a default was abandoned. It stands as obstetrics’ cleanest case of a plausible, near-universal intervention adopted on reasoning and reversed only by the trial that should have come first.
On October 30, 1967, in Zurich, the neurosurgeon M. Gazi Yaşargil sutured a scalp artery to a cortical branch of the middle cerebral artery under the operating microscope, rerouting blood around a blocked vessel to feed a starving brain; the operation was elegant, technically dazzling, and — for the prevention of stroke in patients with carotid and middle-cerebral disease — almost entirely unproven, and that gap between surgical beauty and clinical benefit is the entire case. For nearly two decades the extracranial-intracranial (EC-IC) arterial bypass spread on the strength of its own plausibility and on case series reporting open grafts, until a single randomized trial showed it prevented nothing it claimed to prevent.
The operation was never a fraud and never a mass killer in the lobotomy sense. It killed and disabled quietly, at the margins: a procedure with a roughly 12 percent thirty-day rate of stroke or death imposed up front on patients who, the trial would show, were no better protected afterward. The surrogate that sustained it was graft patency — the bypass stayed open in about 96 percent of cases, a number surgeons and angiograms could see and celebrate. A patent vessel looked like a prevented stroke. It was not the same thing, and conflating the two is the mechanism that kept the operation alive.
The reckoning arrived not from a regulator or a court but from an eight-year, NIH-funded randomized controlled trial led by the Canadian neurologist Henry J. M. Barnett of London, Ontario. Published in the New England Journal of Medicine on November 7, 1985, the International EC/IC Bypass Study randomized 1,377 patients at 71 centers in 14 countries and found that surgery added to best medical care did not reduce fatal or nonfatal stroke; two subgroups (patients with severe middle-cerebral stenosis and those with persisting symptoms after carotid occlusion) actually fared worse with the operation. Within a few years the procedure collapsed from a flourishing subspecialty to a narrow, rarely indicated salvage technique. It was not banned. It was disconfirmed, and it became one of medicine’s foundational lessons in why a trial must precede an operation, not follow it.