Systematic reviews of clinical trials: the early days
In the mid-1970s, while one of us (DR) was deputy editor of the New England Journal of Medicine, he helped to handle a paper by a group led by Thomas (Tom) Chalmers (Chalmers F 2009). This reported their use of a systematic approach to identifying, assessing, and synthesizing the results of controlled trials of anticoagulants in patients with myocardial infarction (Chalmers TC et al. 1977). DR remembers well how it seemed to settle at one blow an argument that had raged for decades. The analysis showed how methods could be used to synthesize the results of separate but similar studies to provide more scientifically robust estimates of the direction and size of treatment effects.
A decade later, Cynthia Mulrow (1987) showed that reviews published in the major general medical journals had usually ignored basic scientific principles (Huth 2008). She and others in the late 1980s, including Tom Chalmers and his colleagues (Sacks et al. 1987; Chalmers TC et al. 1987a; Chalmers TC et al. 1987b), suggested standards to decrease bias and random errors in medical reviews. These included calls for full descriptions of the methods used to search for articles, criteria for inclusion and exclusion of studies, and the statistical methods used to achieve quantitative synthesis of data from separate studies, a technique that had been dubbed ‘meta-analysis’ by an American social scientist a decade earlier (Glass 1976). Throughout the 1980s, Tom Chalmers and his colleagues made substantial contributions to the application, in practice, of these new methods for reviewing evidence from medical research (for example, Baum et al. 1981; Sacks et al. 1985; Himel et al. 1986; Chalmers TC 1988; Chalmers TC et al. 1988a; Chalmers TC et al. 1988b; Longnecker et al. 1988; Sze et al. 1988; Hine et al. 1989).
Improving methods for research synthesis had become a necessity. This was not only because it made no scientific sense to base conclusions on informal analyses of potentially biased ‘convenience samples’ of studies, but also because health professionals could not be expected to cope with the unmanageable volume of studies of potential relevance to their practice. As exemplified by the articles published by Tom Chalmers and his colleagues, the 1980s witnessed increasing use of these methods in medicine (Chalmers I et al. 2002). Usually, reports addressed the effects of particular treatments, for example, for myocardial infarction (Stampfer et al. 1982; Yusuf et al. 1985; Antiplatelet Trialists’ Collaboration 1988), or breast cancer (Stjernswärd 1974; Himel et al. 1986; Early Breast Cancer Trialists’ Collaborative Group 1988). In one sphere – care during pregnancy and childbirth – efforts were made to identify, assess and make sense of all of the controlled trials that could be identified. Importantly, from 1988 onwards, the new medium of electronic publication was exploited to update these analyses cumulatively as new evidence became available (Chalmers I et al. 1993).
Demonstrating the dangers of unscientific medical review articles and textbooks
Throwing out a venerable system of expert reviewing was a radical idea. But it could scarcely have been adopted unless it had been demonstrated that a real problem existed, with important implications for the wellbeing of patients. On 8 July 1992, after DR had become deputy editor of the Journal of the American Medical Association (JAMA), he handled another paper by Tom Chalmers and colleagues (Antman et al. 1992). The paper showed that traditional review articles and textbooks had often given treatment advice that was dangerously inconsistent with the evidence available at the time they had been written.
What was the background to this important study? In the late 1980s, the two senior authors – Thomas Chalmers and Frederick Mosteller (Petrosino 2004) – had joined forces as co-directors of a small Technology Assessment Group housed in the basement of the Harvard School of Public Health. Both of them had been involved in the early development of controlled trials and in pioneering systematic approaches to synthesizing evidence from separate but similar studies. The distinct and important contribution made by the analysis reported in JAMA in 1992 was that it provided clear evidence that the old system of reviews simply did not work, at least as far as treatment for myocardial infarction was concerned.
The authors compared the recommendations of clinical experts writing reviews and book chapters over a period of 30 years with what could have been known had the experts used systematic reviews and meta-analysis. The comparison made clear that effective treatments, as well as dangerous ones, had been overlooked. For example, thrombolytic drugs “did not begin to be recommended even for specific indications by more than half the experts until 13 years after they could have been shown to be effective.” … In 1992, seven years after “an approximately 20% reduction in death was established at the P<0.001 level (OR, 0.78; 95% CI, 0.69 to 0.90), 14 reviews did not mention the treatment or felt it was still experimental.” Antiplatelet drugs “did not begin to be recommended for routine use by more than half the reviewers until 1986, 10 years after they could have been shown to be effective by cumulative meta-analyses, and 6 years after the first published meta-analysis.” Type 1 antiarrhythmic drugs were found to have statistically significant adverse effects on mortality, and serious doubt was cast on the safety of calcium channel blockers. The authors concluded by calling for more timely reviews and the “dissemination of clinical trial results in a format that will facilitate better published clinical guidelines.”
Tom Chalmers was the corresponding author for the article, and its publication was surrounded by confusion and some ill-will. The coincidence of topic and content with a paper that appeared in the New England Journal of Medicine two weeks later (Lau et al. 1992) was an unpleasant surprise to the editors of both journals. Tom Chalmers had implied to JAMA’s editors that the other manuscript, which he called “a description of the cumulative meta-analysis methodology”, had been sent to a specialized statistical journal. Because of personal trust the JAMA editors never asked him for further clarification, but readers accused the New England Journal of Medicine of duplicate publication (Federman and Mutgi 1992). Looking back 17 years later, now that the dust has settled, the editors at JAMA explain Tom Chalmers’ dodgy behaviour by one of his most notable characteristics – relentless competitiveness.
The JAMA article rapidly became a citation classic: at the time of writing this commentary it had been cited 680 times (Eugene Garfield, personal communication). Its findings featured prominently in the published material, oral presentations and discussions promoting the mission of the Cochrane Collaboration (Chalmers I 1993). For example, one of us (IC) was summoned to give evidence to a House of Lords Committee on medical research, and drew on the paper’s findings. The Committee was informed that, five years after a systematic review of controlled trials had shown that thrombolysis reduced the risk of death after myocardial infarction, the Oxford Textbook of Medicine maintained that the benefits of the treatment had not been established (Pentecost 1987). The following weekend (5 Feb 1995), this contribution to the Committee’s thinking led The Sunday Times to publish an article on its front page under the headline ‘Hundreds killed by doctors relying on outdated manuals,’ which prompted a defense of traditional textbooks in a commentary published in the Lancet (Weatherall et al. 1995).
The evolution of a new approach to reviews of medical research
Thousands of systematic reviews and meta-analyses have now been published, and they are the most frequently cited form of clinical research (Patsopoulos et al. 2005). However, the challenge of keeping reviews up to date as new evidence accumulates has not yet been solved. The 1992 articles in JAMA and the New England Journal of Medicine showed retrospectively what could have been known about treatments for myocardial infarction had the results of each new trial been added to those already to hand. Their findings gave urgency to two ideas: that evidence already published was not being put to use, and that a system was needed to increase greatly the dissemination of good evidence. Failure to make use of all available evidence sometimes had lethal consequences.
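The idea of adding each new trial to those already to hand can be illustrated with a minimal sketch in Python. It pools log odds ratios by a simple fixed-effect, inverse-variance method and re-pools each time a trial is added in chronological order. This is an illustration only: the trial counts are invented and are not data from either 1992 article, the function names are hypothetical, and the pooling method is an assumption rather than necessarily the one the authors used.

```python
import math


def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Return (log odds ratio, variance) for one two-arm trial.

    Assumes no cell is zero; a real analysis would need a
    continuity correction or another method for such trials.
    """
    a, b = events_t, n_t - events_t   # treated: deaths, survivors
    c, d = events_c, n_c - events_c   # control: deaths, survivors
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def cumulative_meta_analysis(trials):
    """Yield (year, pooled OR, CI low, CI high) after each trial is added.

    `trials` holds tuples of (year, deaths_treated, n_treated,
    deaths_control, n_control); they are pooled in year order.
    """
    sum_w = 0.0    # running sum of inverse-variance weights
    sum_wy = 0.0   # running weighted sum of log odds ratios
    for year, et, nt, ec, nc in sorted(trials):
        y, v = log_odds_ratio(et, nt, ec, nc)
        w = 1 / v
        sum_w += w
        sum_wy += w * y
        pooled = sum_wy / sum_w
        se = math.sqrt(1 / sum_w)
        yield (
            year,
            math.exp(pooled),
            math.exp(pooled - 1.96 * se),   # lower 95% limit
            math.exp(pooled + 1.96 * se),   # upper 95% limit
        )


# Invented counts, for illustration only (not real trial data).
HYPOTHETICAL_TRIALS = [
    (1972, 12, 100, 18, 100),
    (1975, 20, 150, 24, 150),
    (1979, 30, 300, 45, 300),
    (1983, 80, 900, 105, 900),
]

for year, pooled_or, low, high in cumulative_meta_analysis(HYPOTHETICAL_TRIALS):
    print(f"{year}: OR {pooled_or:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Reading the printed rows from top to bottom shows how the confidence interval narrows as trials accumulate, which is how one can ask, in retrospect, in which year a treatment effect could first have been recognized.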
A new system was emerging with the creation of the Cochrane Collaboration (www.cochrane.org), a non-profit, international organization which was inaugurated formally in 1993 to prepare, maintain and disseminate systematic reviews of the effects of health care (Chalmers I 1993). The growth of the Cochrane Collaboration was very rapid, partly because of the large numbers of people who volunteered to help it achieve its objectives, but also because the internet, the World Wide Web and the spread of personal computers provided easy, fast and cheap communication. These electronic resources also provided the perfect medium for updating evidence, in contrast to reviews published in print journals and textbooks.
However, the challenge of keeping existing systematic reviews up to date has not yet been cracked by any organization in the world, including the Cochrane Collaboration, and authors and editors of journals are still not taking seriously the need for new results to be set, systematically, in the context of relevant existing evidence (Clarke and Chalmers 1998; Clarke et al. 2002; Clarke et al. 2007; Chalmers and Glasziou 2009). So the problem identified so clearly in the paper by Antman and his colleagues has still not been overcome, and this means that patients continue to suffer unnecessarily.
Tom Chalmers’ publishing career in clinical trials began in 1955 with a remarkable report of a randomized factorial trial of bed rest and diet for hepatitis (Chalmers TC et al. 1955). In a personal reflection on the importance of this paper, the clinical epidemiologist David Sackett (2008) wrote: “Reading this paper not only changed my treatment plan for my patient. It forever changed my attitude toward conventional wisdom, uncovered my latent iconoclasm, and inaugurated my career in what I later labeled ‘clinical epidemiology.’” Similarly, the article by Tom’s group that DR had helped publish in 1977 in the New England Journal of Medicine completely changed the way DR thought about medicine and approached the evidence (Chalmers TC et al. 1977). In the early 1990s, after being shown early versions of the analyses that would form the basis of the article by Antman et al. (1992), IC suggested to Tom that it would come to be regarded as the most important of his many important publications. This commentary is a tribute to all of the authors of that article, Elliott Antman, Joseph Lau, Bruce Kupelnick, Frederick Mosteller and Tom Chalmers, but to Tom particularly. He died three years after it was published, but the article has enduring importance for clinicians and patients alike.
Antiplatelet Trialists’ Collaboration (1988). Secondary prevention of vascular disease by prolonged anti-platelet treatment. BMJ 296:320-331.
Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC (1992). A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. Treatments for myocardial infarction. JAMA 268:240-8.
Baum ML, Anish DS, Chalmers TC, Sacks HS, Smith H, Fagerstrom RM (1981). A survey of clinical trials of antibiotic prophylaxis in colon surgery: Evidence against further use of no-treatment controls. N Engl J Med 305:795-799.
Chalmers F (2009). Thomas C Chalmers (1917-1995). The James Lind Library (https://www.jameslindlibrary.org/articles/thomas-c-chalmers-1917-1995/).
Chalmers I (1993). The Cochrane Collaboration: preparing, maintaining and disseminating systematic reviews of the effects of health care. In: Warren KS, Mosteller F, eds. Doing more good than harm: the evaluation of health care interventions. Annals of the New York Academy of Sciences 703:156-163.
Chalmers I, Glasziou P (2009). Avoidable waste in the production and reporting of research evidence. Lancet 374:86-89.
Chalmers I, Hedges L, Cooper H (2002). A brief history of research synthesis. Evaluation and the Health Professions 25:12-37.
Chalmers I, Enkin M, Keirse MJNC (1993). Preparing and updating systematic reviews of randomized controlled trials of health care. Milbank Quarterly 71:411-437.
Chalmers TC, ed (1988). Data analysis for clinical medicine: the quantitative approach to patient care in gastroenterology. Rome: International University Press.
Chalmers TC, Eckhardt RD, Reynolds WE, Cigarroa JG, Deane N, Reifenstein RW, Smith CW, Davidson CS (1955). The treatment of acute infectious hepatitis. Controlled studies of the effects of diet, rest, and physical reconditioning on the acute course of the disease and on the incidence of relapses and residual abnormalities. Journal of Clinical Investigation 34:1163-1235.
Chalmers TC, Matta RJ, Smith H, Kunzler A-M (1977). Evidence favoring the use of anticoagulants in the hospital phase of acute myocardial infarction. New England Journal of Medicine 297:1091-96.
Chalmers TC, Levin HR, Sacks HS, Reitman D, Berrier J, Nagalingam R (1987a). Meta-analysis of clinical trials as a scientific discipline. I. Control of bias and comparison with large cooperative trials. Stat Med 6:315-25.
Chalmers TC, Berrier J, Sacks HS, Levin H, Reitman D, Nagalingam R (1987b). Meta-analysis of clinical trials as a scientific discipline. II. Replicate variability and comparison of studies that agree and disagree. Stat Med 6:733-44.
Chalmers TC, Gray D, Bhun A, Berlin J, Orza MJ, Nagalingam R, Hewitt P (1988a). Data analysis in gastroenterology. Vagotomy for recurrent duodenal ulcer. Gastroenterology Intl 1:41-7.
Chalmers TC, Berrier J, Hewitt P, Berlin J, Reitman D, Nagalingam R, Sacks H (1988b). Meta-analysis of randomized control trials as a method of estimating rare complications of non-steroidal anti-inflammatory drug therapy. Aliment Pharmacol Therap 2 (Suppl 1):9-26.
Clarke M, Chalmers I (1998). Discussion sections in reports of controlled trials published in general medical journals: islands in search of continents? JAMA 280:280-282.
Clarke M, Alderson P, Chalmers I (2002). Discussion sections in reports of controlled trials published in general medical journals. JAMA 287:2799-2801.
Clarke M, Hopewell S, Chalmers I (2007). Reports of clinical trials should begin and end with up-to-date systematic reviews of other relevant evidence: a status report. Journal of the Royal Society of Medicine 100:187-190.
Early Breast Cancer Trialists’ Collaborative Group (1988). Effects of adjuvant tamoxifen and of cytotoxic therapy on mortality in early breast cancer. An overview of 61 randomized trials among 28,896 women. N Engl J Med 319:1681-92.
Federman DJ, Mutgi AB (1992). Redundant Publication? New England Journal of Medicine 327:1316.
Glass GV (1976). Primary, secondary and meta-analysis of research. Educational Researcher 10:3-8.
Himel HN, Liberati A, Gelber RD, Chalmers TC (1986). Adjuvant chemotherapy for breast cancer: A pooled estimate based on results from published randomized control trials. JAMA 256:1148-1159.
Hine LK, Laird N, Hewitt P, Chalmers TC (1989). Meta-analytic evidence against prophylactic use of lidocaine in acute myocardial infarction. Archives of Internal Medicine 149:2694-8.
Huth EJ (2008). The move toward setting standards for the content of medical review articles. The James Lind Library (https://www.jameslindlibrary.org/articles/the-move-toward-setting-scientific-standards-for-the-content-of-medical-review-articles/).
Lau J, Antman EM, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers TC (1992). Cumulative meta-analysis of therapeutic trials for myocardial infarction. New England Journal of Medicine 327:248-254.
Longnecker MP, Berlin JA, Orza MJ, Chalmers TC (1988). A meta-analysis of alcohol consumption in relation to risk of breast cancer. JAMA 260:652-6.
Mulrow CD (1987). The medical review article. Annals of Internal Medicine 106:485-8.
Patsopoulos NA, Analatos AA, Ioannidis JPA (2005). Relative citation impact of various study designs in the health sciences. JAMA 293:2362-2366.
Pentecost BL (1987). Myocardial infarction. In: Weatherall DJ, Ledingham JGG, Warrell DA, eds. Oxford Textbook of Medicine. 2nd edn, Vol 2, Oxford: Oxford University Press, p 13.173.
Petrosino A (2004). Charles Frederick [Fred] Mosteller (1916-2006). The James Lind Library (https://www.jameslindlibrary.org/articles/charles-frederick-fred-mosteller-1916-2006/).
Sackett D (2008). A 1955 clinical trial report that changed my career. The James Lind Library (https://www.jameslindlibrary.org/articles/a-1955-clinical-trial-report-that-changed-my-career/).
Sacks HS, Chalmers TC, Berk AA, Reitman D (1985). Should mild hypertension be treated? An attempted meta-analysis of the clinical trials. Mt Sinai J Med 52:265-270.
Sacks HS, Berrier J, Reitman D, Ancona-Berk VA, Chalmers TC (1987). Meta-analysis of randomized controlled trials. New England Journal of Medicine 316:450-455.
Stampfer MJ, Goldhaber SZ, Yusuf S, Peto R, Hennekens CH (1982). Effect of intravenous streptokinase on acute myocardial infarction: pooled results from randomized trials. New England Journal of Medicine 307:1180-1182.
Stjernswärd J (1974). Decreased survival related to irradiation postoperatively in early breast cancer. Lancet 304:1285-1286.
Sze PC, Reitman D, Pincus M, Sacks HS, Chalmers TC (1988). Anti-platelet agents and secondary stroke prevention: a meta-analysis of the randomized control trials. Stroke 19:436-42.
Weatherall DJ, Ledingham JG, Warrell DA (1995). On dinosaurs and medical textbooks. Lancet 346:4-5.
Yusuf S, Peto R, Lewis J, Collins R, Sleight P (1985). Beta blockade during and after myocardial infarction: an overview of the randomized trials. Progress in Cardiovascular Diseases 27:335-371.