Moberg J, Kramer M (2015). A brief history of the cluster randomized trial design.

© Jenny Moberg and Michael Kramer. Contact author: Jenny Moberg, Norwegian Knowledge Centre for the Health Services, Boks 7004 St Olavsplass, N-0130 Oslo, Norway. Email: jenny.moberg@kunnskapssenteret.no


Cite as: Moberg J, Kramer M (2015). A brief history of the cluster randomized trial design. JLL Bulletin: Commentaries on the history of treatment evaluation (http://www.jameslindlibrary.org/articles/a-brief-history-of-the-cluster-randomized-trial-design/)


Introduction

The cluster randomized trial (CRT) is commonly considered a relatively new research study design (Donner and Klar 2000; Eldridge and Kerry 2012; Murray 1998). Here we trace the idea of comparing interventions applied to groups of individuals from a few very early reports through its evolution into the modern-day CRT. The CRT has been defined as a comparative study in which the units randomized are pre-existing (natural or self-selected) groups whose members have an identifiable feature in common, and in which outcomes are measured in all, or a representative sample, of the individual members of the groups (Donner and Klar 2000). Summaries of many of the CRT reports published before the methodological review of this research design by Donner and his colleagues (1990) can be viewed in the James Lind Library (http://www.jameslindlibrary.org/additional-methods/allocation-bias/cluster-allocation/; see Appendix for details of our literature search).

What is a cluster?

The groups used in CRTs vary widely and range in size from families to entire communities. The common feature shared by members of a cluster may be:

Why use cluster randomized trials?

CRTs are well suited and are now commonly used to evaluate public health, health policy and health system interventions. They are ideal for testing interventions when the decision (policy) about whether or not to implement the intervention will be taken on behalf of a group. CRTs are also useful when the nature of the intervention carries a high risk of contamination, that is, when individuals randomized to different comparison groups are in frequent contact with one another and thus may be influenced (‘contaminated’), in either or both directions, by the alternative treatment(s). Contamination is likely to occur in comparisons of public health promotion interventions within the same community, and of different approaches to health care provided by the same clinician to patients under his or her care. In addition to these scientific reasons, cluster designs can also have practical advantages over individual randomization because of lower implementation costs, or administrative convenience.

Early examples of group allocation

The earliest proposals of which we are aware for treatment comparisons in which the intervention was assigned to a group, rather than to an individual, are centuries old. In 1648, van Helmont proposed a trial of his new methods of treating febrile patients without purging and blood-letting, in which the participants would be divided into groups and the groups then allocated by “casting lots” to decide which would receive which of the treatments to be compared (Van Helmont 1648). It is unlikely that this trial ever took place, but the idea of cluster randomization is there.

In 1657, Starkey proposed a trial in defence of van Helmont’s treatment methods, in which groups of patients were to be assigned to be treated either by Starkey (according to van Helmont’s methods) or by van Helmont’s critics. Starkey seems to have appreciated that the process of treatment allocation should be designed to prevent confounding by differences between the groups receiving different treatments. He suggested that patients first be grouped in tens. Starkey and his opponent should then take turns to divide each ten into two groups of five, with whoever had not done the dividing choosing one of the two groups of five patients. The ‘divider’ should then treat the remaining five patients. As in van Helmont’s proposed trial, the groups of patients did not exist prior to the trial but were created specifically for it (Starkey 1657).

Celli’s 1900 trial may be the first in which pre-existing groups were allocated to treatment – an important step towards the modern CRT design (Celli 1900; Ferroni et al. 2011). Celli studied whether mosquito netting reduced malaria in households of Italian railway workers. The households were selected (although not randomized) to receive or not receive the intervention. Neighboring households were used as controls. This trial heralds one of the most common uses of CRTs today: the evaluation of infectious disease control methods and, in particular, of methods to prevent malaria.

The clinical trial reported by Amberson and his colleagues in 1931, challenging the use of gold for treating pulmonary tuberculosis, was an early trial using a single coin toss to allocate two matched comparison groups either to injections of a gold-containing treatment (sanocrysin), or to control injections of distilled water. In addition, patients and the investigators measuring the trial outcomes were blinded to the participants’ treatment allocation (Amberson et al. 1931). This trial has sometimes been considered well designed and conducted, but it falls short of the current standards for a cluster randomized trial; the clusters were created for the trial and only two clusters were randomized (Diaz and Neuhauser 2004).

Many early CRTs in non-medical fields were school-based evaluations of educational interventions. Indeed, methodological discussion of the cluster randomized design appears to have begun in 1940 with Lindquist’s book on methods in education research in schools (Lindquist 1940; Klar and Donner 2004). Much of what Lindquist wrote, however, also applies to clinical and public health interventions.

Recent developments

Only sparse use of CRTs was evident before the 1980s (Bland 2004). However, the last half century has seen a steady increase in the number of CRTs published in the medical literature: from one a year in the 1960s; to seven in 1990, when Donner, Brown and Brasher published their methodological review of CRTs (Donner et al. 1990); to over 120 in 2008.

Every pre-1960s CRT of which we are aware tested some aspect of infectious disease prevention or treatment (Donner et al. 1990; Coburn 1944; Mellanby et al. 1948; Comstock 1962). In the 1970s, CRTs were used extensively for such trials, particularly in low-income countries (Storey et al. 1973; Sutter and Ballard 1983; Isaakidis and Ioannidis 2003). CRTs were also recognized as being suitable for evaluating public health interventions aiming to change health behavior, such as improving dental care (Reiss et al. 1976), promoting hand-washing (Black et al. 1981), and increasing attendance for immunization (Yokley and Glenwick 1984).

The risk of ‘contamination’ between comparison groups is high in studies evaluating screening interventions. In the 1980s, two large-scale CRTs of screening interventions showed how this design can be used to reduce the influence of contamination on estimates of an intervention’s effects. The first, a trial by Grant and colleagues, evaluated the effect of routine counting of fetal movement by pregnant women on the likelihood of antepartum stillbirth, an intervention whose assessment would otherwise have been compromised by a high risk of contamination (Grant et al. 1989).

The Swedish screening mammography study published by Tabár and colleagues (1985) is an example of the use of a CRT to evaluate complex public health screening interventions applied to large populations. By offering screening to a community selected at random from matched communities, separated by 200 km on average, contamination among communities was reduced, and trial implementation became more practicable (Tabár et al. 1985). These features of CRTs – reduction of contamination, and practicability of very large scale public health trials – are also well illustrated in a trial of the effect of vitamin A supplementation on childhood mortality, morbidity, and preschool growth (reported in Sommer et al. 1986; West et al. 1988; Abdeljaber et al. 1991).

CRTs were also shown to be useful for evaluating the impact of multi-faceted approaches to health improvement, for example, trials of nutritional supplementation and maternal education in expectant mothers and infants at risk of malnutrition (Waber et al. 1981), and a trial of breast cancer screening methods and the nurses who implemented them (Roberts et al. 1984).

Schools have often been used in public health CRTs. They are convenient places to implement health education interventions relevant to children and adolescents, such as prevention of tobacco, alcohol, and drug use; promotion of sexual health; and primary prevention of chronic disease through promotion of healthy eating and physical activity. Entire schools or classes within schools are ready-made clusters (Dwyer et al. 1983; Lloyd et al. 1983; Simons-Morton et al. 1984; Dielman et al. 1989; Schinke et al. 1986). School clusters have also been used to evaluate interventions aimed at helping children to become ‘health messengers’, as in a trial assessing whether hypertension education of children had an impact on the blood pressure of their parents (Fors et al. 1989).

CRTs have become recognized as being valuable in evaluating many different types of health system interventions – health care delivery (Bass et al. 1986; Choi et al. 1986; Seto et al. 1989), governance, financial arrangements, and implementation strategies. A CRT reported by Vogt et al. in 1983 compared methods to increase reporting of notifiable diseases by doctors (Vogt et al. 1983). This trial also illustrates an important group of CRTs in which clusters consist of patients treated by the same clinician. These trials are often called ‘professional cluster randomized trials’. Each clinician is randomly assigned to an intervention or control arm, and the intervention is targeted at the clinician – not at his or her patients. Such clinician-targeted interventions often aim to modify the behaviour of healthcare providers in some way, for example, by using clinical guidelines, training, or decision support systems (Chassin and McCue 1986; McDonald et al. 1984; Stross et al. 1986; Evans et al. 1986).

In some such trials, clinicians implement the intervention without involving patients in the decision, for example, in reporting cases of a notifiable disease (Vogt et al. 1983), or arranging for medical assistants to screen for and manage patients with hypertension (Bass et al. 1986). Other clinician-targeted interventions are intended to affect both clinician practices and patient outcomes, for example, educational programmes to help physicians improve blood pressure control among their patients (Evans et al. 1986).

Through the 1990s, the number of published trials including cluster randomization increased (Bland 2004), and the terms ‘group randomized’ (Murray 1998), ‘community randomized’ (Donner et al. 1990; Donner and Klar 2000), and even ‘place randomized’ were all used to describe CRTs. A BMJ series on statistics in 1997 and 1998 used the term ‘cluster randomized’ (Bland and Kerry 1997; Kerry and Bland 1998). By the early 2000s, with the published extension of the CONSORT statement on reporting guidelines for CRTs (Campbell et al. 2004) and several reviews of CRTs (Puffer et al. 2003; Isaakidis and Ioannidis 2003; Eldridge et al. 2004), ‘cluster randomized trial’ had become the most commonly used term for this design. The publication of important, large scale, well-conducted CRTs in this century, such as those evaluating the effects of community groups on birth and other outcomes in poor rural populations (Manandhar et al. 2004; Morrison et al. 2011; Azad et al. 2010; Tripathy et al. 2010; Lewycka et al. 2013), can be considered as a ‘coming of age’ of the CRT design.

Current and future challenges  

Study design
As experience with CRTs has increased over time, difficulties and problems have become apparent. The design (especially with respect to blinding), analysis, and conduct of CRTs are often more complicated than for individually randomized trials. CRTs have been conducted with as few as one intervention cluster and one control cluster, and an insufficient number of clusters (inadequate sample size) is a persistent problem. We have not included CRTs with fewer than two clusters in each arm in the JLL, or as examples in this article. Stratification and matching have been used to increase the comparability of the clusters, which helps increase precision with small numbers of clusters. While blinding of participants and outcome assessors is ideal in all randomized trials, it is often difficult or even impossible in CRTs. The units of randomization and the units of observation may be different, and this affects informed consent, recruitment, sample sizes, randomization, and analysis.
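
As an illustration of matching, the following minimal sketch allocates clusters within matched pairs. It assumes the clusters have already been paired on some baseline characteristic; the function name `randomize_matched_pairs`, the village labels, and the seed are hypothetical and are not taken from any of the trials discussed here.

```python
import random

def randomize_matched_pairs(pairs, seed=None):
    """Assign one cluster in each matched pair to each arm at random.

    `pairs` is a list of 2-tuples of cluster labels that were matched
    beforehand (for example on size or baseline event rate). Returns a
    dict mapping each cluster label to "intervention" or "control".
    """
    rng = random.Random(seed)  # seeded generator so the allocation is reproducible
    allocation = {}
    for first, second in pairs:
        # A fair "coin toss" within the pair decides which cluster is treated.
        if rng.random() < 0.5:
            treated, untreated = first, second
        else:
            treated, untreated = second, first
        allocation[treated] = "intervention"
        allocation[untreated] = "control"
    return allocation

# Hypothetical example: six villages matched into three pairs.
pairs = [("village_A", "village_B"),
         ("village_C", "village_D"),
         ("village_E", "village_F")]
allocation = randomize_matched_pairs(pairs, seed=2015)
```

Because the randomization happens within each pair, the two arms always contain the same number of clusters, and each arm holds exactly one member of every matched pair.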

Study analysis
CRTs can be analysed in the same way as any individually randomized trial if each cluster contributes a single data item to the analysis, for example, the average blood pressure among all patients of a randomized physician. By using all the individual data points in each cluster in the analysis, the statistical power of a trial can be increased. However, the effect of clustering must then be considered. As early as 1940, Lindquist recognized the need to account for clustering in the analysis of CRTs (Lindquist 1940; Klar and Donner 2004). In 1978, Cornfield pointed out the need for special consideration of the statistical features of CRTs in health research and, in particular, the need to account for between-cluster variation (Cornfield 1978). And as Donner and Klar pointed out in 2000, analysis must also take into consideration variation in cluster size, which is often substantial (Donner and Klar 2000).
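
The cluster-level approach, in which each cluster contributes one summary value, can be sketched as follows. This is a minimal illustration assuming a continuous outcome and an equal-variance two-sample t statistic computed on unweighted cluster means; the function name `cluster_level_t` and the blood pressure values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cluster_level_t(intervention_clusters, control_clusters):
    """Two-sample t statistic computed on cluster means.

    Each argument is a list of clusters; each cluster is a list of
    individual outcome values. The cluster, not the individual, is the
    unit of analysis, so the degrees of freedom depend on the number of
    clusters rather than the number of individuals.
    Returns (difference in means, t statistic, degrees of freedom).
    """
    means_i = [mean(c) for c in intervention_clusters]
    means_c = [mean(c) for c in control_clusters]
    k_i, k_c = len(means_i), len(means_c)
    # Pooled variance of the cluster means (equal-variance t-test).
    pooled = ((k_i - 1) * variance(means_i)
              + (k_c - 1) * variance(means_c)) / (k_i + k_c - 2)
    se = sqrt(pooled * (1 / k_i + 1 / k_c))
    diff = mean(means_i) - mean(means_c)
    return diff, diff / se, k_i + k_c - 2

# Hypothetical systolic blood pressures, grouped by randomized physician.
intervention = [[138, 142, 140], [135, 137], [141, 139, 143, 141]]
control = [[148, 152], [145, 147, 146], [150, 150, 153]]
diff, t_stat, df = cluster_level_t(intervention, control)
```

In this example 17 patients yield only 4 degrees of freedom, because six physicians were randomized. Unweighted cluster means treat every cluster equally whatever its size, which is one reason why variation in cluster size needs separate consideration.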

Members of clusters are more likely to have similar outcomes than a randomly selected sample of individuals from the same population, particularly when members self-select into a cluster. The most commonly used measure of the degree of similarity among members of a cluster is the intra-class correlation coefficient (ICC). The larger the ICC, the larger the number of clusters or individuals needed to achieve statistical power comparable to that of a trial using individual randomization. Analysing a CRT without accounting for clustering yields a falsely low estimate of variance and hence inflates statistical significance. And as shown by Kramer et al. (2009), loss of statistical power is even more dramatic when the outcome measurements are also clustered with treatment (‘double jeopardy’).
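
The relationship between the ICC and sample size is often summarized by the design effect, 1 + (m − 1) × ICC, where m is the number of individuals per cluster. The sketch below applies this standard formula under the simplifying assumption that all clusters are the same size; the function names and the numbers in the example are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Variance inflation due to clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_individuals, cluster_size, icc):
    """Number of independent individuals that would carry the same information."""
    return n_individuals / design_effect(cluster_size, icc)

def clusters_needed_per_arm(n_per_arm_individual_design, cluster_size, icc):
    """Clusters per arm needed to match an individually randomized design.

    Inflates the per-arm sample size of the individually randomized
    design by the design effect, then rounds up to whole clusters.
    """
    inflated = n_per_arm_individual_design * design_effect(cluster_size, icc)
    return math.ceil(inflated / cluster_size)

# Hypothetical example: 20 patients per practice and an ICC of 0.05.
deff = design_effect(20, 0.05)                 # about 1.95
n_eff = effective_sample_size(400, 20, 0.05)   # about 205 of 400 patients
k = clusters_needed_per_arm(200, 20, 0.05)     # whole practices per arm
```

Even a modest ICC of 0.05 nearly doubles the required sample size when clusters contain 20 individuals. With an ICC of zero, or with clusters of size one, the design effect is 1 and the design reduces to individual randomization.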

A recent study re-analysing the results from CRTs of health system interventions using time series methods shows that, if data from CRTs are analysed without taking account of trends over time, the findings may be misleading (Fretheim et al. 2014). Fretheim and colleagues suggest adding time series approaches to the overall comparison of randomized groups, so as to gauge changes in the effect of the intervention over time.

Reporting
The quality of reports of CRTs has been very variable. Some studies are reported simply as ‘randomized trials’, leaving readers unaware that the unit of randomization is anything other than the individual. Specific key words that would identify CRTs are often not provided in abstracts, so full-text publications have to be retrieved to establish whether or not the study reported was a CRT. Despite extension of the CONSORT statement on reporting guidelines for CRTs (Campbell et al. 2004), the titles or abstracts of 50% of CRTs still fail to indicate that the trial was cluster randomized (Taljaard et al. 2010). Conversely, other papers describe themselves as CRTs in their titles but, on careful inspection, are clearly not CRTs (Bland 2004).

Summing up
CRTs have a long history in both educational and health research and, over the last several decades, have assumed an increasing role in rigorous evaluations of complex clinical, public health, and health system interventions in which comparisons between individually randomized participants are likely to be ‘contaminated’ by contact among them. CRTs can also help overcome the administrative barriers and economic costs inherent in contacting, recruiting, and randomizing large numbers of individuals. Major challenges include ensuring statistical power by recruiting adequate numbers of clusters, considering randomized crossover of clusters (Connolly et al. 2013; Bellomo et al. 2013; Stockwell et al. 2015), minimizing intra-cluster correlation of outcomes and measurements, and improving reporting.

Acknowledgements
We are grateful to Tikki Pang for providing WHO support to assist in the identification of reports of cluster randomized trials, to Marit Johansen for searching for them, and to Allan Donner and Atle Fretheim for helpful comments on an earlier draft.

This James Lind Library article has been republished in the Journal of the Royal Society of Medicine 2015;108:192-198.

References

Abdeljaber MH, Monto AS, Tilden RL, Schork MA, Tarwotjo I (1991). The impact of vitamin A supplementation on morbidity: a randomized community intervention trial. Am J Public Health 81:1654-6.

Amberson JB, McMahon BT, Pinner M (1931). A clinical trial of sanocrysin in pulmonary tuberculosis. American Review of Tuberculosis 24:401-435.

Azad K, Barnett S, Banerjee B, Shaha S, Khan K, Rego AR, Barua S, Flatman D, Pagel C, Prost A, Ellis M, Costello A (2010). Effect of scaling up women’s groups on birth outcomes in three rural districts in Bangladesh: a cluster-randomized controlled trial. Lancet 375:1193-202.

Bass MJ, McWhinney IR, Donner A (1986). Do family physicians need medical assistants to detect and manage hypertension? CMAJ 134:1247-55.

Bellomo R, Forbes A, Akram M, Bailey M, Pilcher DV, Cooper DJ (2013). Why we must cluster and cross over. Critical Care and Resuscitation 15:155-57.

Biglan A, Severson H, Ary D, Faller C, Gallison C, Thompson R, Glasgow R, Lichtenstein E (1987). Do smoking prevention programs really work? Attrition and internal and external validity of an evaluation of a refusals skills training program. Journal of Behavioral Medicine 10:159-71.

Black RE, Dykes AC, Anderson KE, Wells JG, Sinclair SP, Gary GW Jr, Hatch MH, Gangarosa EJ (1981). Handwashing to prevent diarrhea in day-care centers. American Journal of Epidemiology 113:445-51.

Bland JM (2004). Cluster randomized trials in the medical literature: two bibliometric surveys. BMC Medical Research Methodology 4:21.

Bland JM, Kerry SM (1997). Statistics notes: Trials randomised in clusters. BMJ 315:600.

Bush PJ, Zuckerman AE, Theiss PK, Taggart VS, Horowitz C, Sheridan MJ, Walter HJ (1989). Cardiovascular risk factor prevention in black schoolchildren: two-year results of the “Know Your Body” program. American Journal of Epidemiology 129:466-82.

Campbell MK, Elbourne DR, Altman DG; CONSORT group (2004). CONSORT statement: extension to cluster randomised trials. BMJ 328:702-8.

Celli A (1900). The new prophylaxis against malaria in Lazio. Lancet 156:1603-1606.

Chassin MR, McCue SM (1986). A randomized trial of medical quality assurance. Improving physicians’ use of pelvimetry. JAMA 256:1012-1016.

Choi T, Jameson H, Brekke ML, Podratz RO, Mundahl H (1986). Effects on nurse retention. An experiment with scheduling. Medical Care 24:1029-43.

Coburn AF (1944). The prevention of respiratory tract bacterial infections. JAMA 126:88-89.

Comstock GW (1962). Isoniazid prophylaxis in an undeveloped area. American Review of Respiratory Disease 86:810-822.

Connolly SJ, Philippon F, Longtin Y, Casanova A, Birnie DH, Exner DV, Dorian P, Prakash R, Alings M, Krahn AD (2013). Randomized cluster crossover trials for reliable, efficient, comparative effectiveness testing: design of the Prevention of Arrhythmia Device Infection Trial (PADIT). Can J Cardiol 29:652-8.

Connor MK, Smith LG, Fryer A, Erickson S, Fryer S, Drake J (1986). Future Fit: a cardiovascular health education and fitness project in an after-school setting. Journal of School Health 56:329-33.

Cornfield J (1978). Randomization by group: a formal analysis. Am J Epidemiol 108:100-2.

Diaz M, Neuhauser D (2004). Lessons from using randomization to assess gold treatment for tuberculosis. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Dielman TE, Shope JT, Leech SL, Butchart AT (1989). Differential effectiveness of an elementary school-based alcohol misuse prevention program. Journal of School Health 59:255-63.

Donner A, Brown KS, Brasher P (1990). A methodological review of non-therapeutic intervention trials employing cluster randomization, 1979-1989. International Journal of Epidemiology 19:795-800.

Donner A, Klar N (2000). Design and analysis of cluster randomization trials in health research. London: Arnold.

Dwyer T, Coonan WE, Leitch DR, Hetzel BS, Baghurst RA (1983). An investigation of the effects of daily physical activity on the health of primary school students in South Australia. Int J Epidemiol 12:308-13.

Eldridge S, Kerry S (2012). A practical guide to cluster randomized trials in health services research. Chichester: Wiley.

Evans CE, Haynes RB, Birkett NJ, Gilbert JR, Taylor DW, Sackett DL, Johnston ME, Hewson SA (1986). Does a mailed continuing education program improve physician performance? Results of a randomized trial in hypertensive care. JAMA 255:501-4.

Farr BM, Hendley JO, Kaiser DL, Gwaltney JM (1988). Two randomized controlled trials of virucidal nasal tissues in the prevention of natural upper respiratory infections. Am J Epidemiol 128:1162-72.

Ferroni E, Jefferson T, Gachelin G (2011). Angelo Celli and research on the prevention of malaria in Italy a century ago. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Flay BR, Ryan KB, Best JA, Brown KS, Kersell MW, d’Avernas JR, Zanna MP (1985). Are social-psychological smoking prevention programs effective? The Waterloo study. Journal of Behavioral Medicine 8:37-59.

Fors SW, Owen S, Hall WD, McLaughlin J, Levinson R (1989). Evaluation of a diffusion strategy for school-based hypertension education. Health Education Quarterly 16:255-61.

Fretheim A, Zhang F, Ross-Degnan D, Oxman AD, Cheyne H, Foy R, Goodacre S, Herrin J, Kerse N, McKinlay RJ, Wright A, Soumerai SB (2014). A reanalysis of cluster randomized trials showed interrupted time-series studies were valuable in health system evaluation. Journal of Clinical Epidemiology, In press. http://www.jclinepi.com/article/S0895-4356(14)00409-0/fulltext.

Grant A, Elbourne D, Valentin L, Alexander S (1989). Routine formal fetal movement counting and risk of antepartum late death in normally formed singletons. Lancet 2:345-9.

Gyorkos TW, Frappier-Davignon L, MacLean JD, Viens P (1989). Effect of screening and treatment on imported intestinal parasite infections: results from a randomized, controlled trial. American Journal of Epidemiology 129:753-61.

Horwitz O, Magnus K (1974). Epidemiologic evaluation of chemoprophylaxis against tuberculosis. American Journal of Epidemiology 99:333-342.

Isaakidis P, Ioannidis JP (2003). Evaluation of cluster randomized trials in Sub-Saharan Africa. Am J Epidemiol 158:921-926.

Job-Spira N, Meyer L, Bouvet E, Janaud A, Spira A (1988). The prevention of sexually transmitted diseases which affect fertility: methodological problems and initial results. European Journal of Obstetrics & Gynecology and Reproductive Biology 27:157-164.

Klar N, Donner A (2004). The impact of EF Lindquist’s 1940 text “Statistical Analysis in Educational Research” on cluster randomization. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Kerry SM, Bland JM (1998). The intracluster correlation coefficient in cluster randomisation. BMJ 316:1455.

Kornitzer M, Rose G (1985). WHO European Collaborative Trial of multifactorial prevention of coronary heart disease. Preventive Medicine 14:272-8.

Kramer MS, Martin RM, Sterne JA, Shapiro S, Mourad D, Platt RW (2009). The double jeopardy of clustered measurement and cluster randomisation. BMJ 339:503-505.

Lewycka S, Mwansambo C, Rosato M, Kazembe P, Phiri T, Mganga A, Chapota H, Malamba F, Kainja E, Newell ML, Greco G, Pulkki-Brännström AM, Skordis-Worrall J, Vergnano S, Osrin D, Costello A (2013). Effect of women’s groups and volunteer peer counselling on rates of mortality, morbidity, and health behaviours in mothers and children in rural Malawi (MaiMwana): a factorial, cluster-randomized controlled trial. Lancet 381:1721-35.

Lindquist EF (1940). Statistical analysis in educational research. Boston: Houghton Mifflin.

Lloyd DM, Alexander HM, Callcott R, Dobson AJ, Hardes GR, O’Connell DL, Leeder SR (1983). Cigarette smoking and drug use in schoolchildren: III-evaluation of a smoking prevention education programme. Int J Epidemiol 12:51-8.

Manandhar DS, Osrin D, Shrestha BP, Mesko N, Morrison J, Tumbahangphe KM, Tamang S, Thapa S, Shrestha D, Thapa B, Shrestha JR, Wade A, Borghi J, Standing H, Manandhar M, Costello AM; Members of the MIRA Makwanpur trial team (2004). The effect of a participatory intervention with women’s groups on birth outcomes in Nepal: cluster randomized controlled trial. Lancet 364:970-9.

Mayer JA, Dubbert PM, Scott RR, Dawson BL, Ekstrand ML, Fondren TG (1987). Breast self-examination: The effects of personalized prompts on practice frequency. Behavior Therapy 18:135-46.

McDonald CJ, Hui SL, Smith DM, Tierney WM, Cohen SJ, Weinberger M, McCabe GP (1984). Reminders to physicians from an introspective computer medical record. A two-year randomized trial. Ann Intern Med 100:130-8.

Mellanby H, Andrewes CH, Dudgeon JA, Mackay DG (1948). Vaccination against influenza A. Lancet 251:978-982.

Morrison J, Tumbahangphe KM, Budhathoki B, Neupane R, Sen A, Dahal K, Thapa R, Manandhar R, Manandhar D, Costello A, Osrin D (2011). Community mobilisation and health management committee strengthening to increase birth attendance by trained health workers in rural Makwanpur, Nepal: study protocol for a cluster randomized controlled trial. Trials 12:128.

Murray DM (1998). Design and analysis of group-randomized trials. Monographs in Epidemiology and Biostatistics, Vol 27. New York: Oxford University Press.

Puffer S, Torgerson D, Watson J (2003). Evidence for risk of bias in cluster randomised trials: review of recent trials published in three general medical journals. BMJ 327:785-9.

Puska P, Nissinen A, Pietinen P, Iacono J (1985). Role of dietary fat in blood pressure control. Scandinavian Journal of Clinical and Laboratory Investigation, Supplement, pp. 62-9.

Reiss ML, Piotrowski WD, Bailey JS (1976). Behavioral community psychology: encouraging low-income parents to seek dental care for their children. Journal of Applied Behavior Analysis 9:87-97.

Roberts MM, Alexander FE, Anderson TJ, Forrest APM, Hepburn W, Huggins A, Kirkpatrick AE, Lamb J, Lutz W, Muir BB (1984). The Edinburgh randomised trial of screening for breast cancer: Description of method. British Journal of Cancer 50:1-6.

Schinke SP, Gilchrist LD, Schilling RF, Senechal VA (1986). Smoking and smokeless tobacco use among adolescents: trends and intervention results. Public Health Reports 101:373-8.

Seto WH, Ching PTY, Fung JPM, Fielding R (1989). The role of communication in the alteration of patient-care practices in hospital-a prospective study. Journal of Hospital Infection 14:29-37.

Simons-Morton BG, Coates TJ, Saylor KE, Sereghy E, Barofsky I (1984). Great Sensations: a program to encourage heart healthy snacking by high school students. Journal of School Health 54:288-91.

Sommer A, Tarwotjo I, Djunaedi E, West KP Jr, Loeden AA, Tilden R, Mele L (1986). Impact of vitamin A supplementation on childhood mortality: A randomized controlled community trial. Lancet 1:1169-73.

Stanton BF, Clemens JD (1987). An educational intervention for altering water-sanitation behaviors to reduce childhood diarrhea in urban Bangladesh. II. A randomized trial to assess the impact of the intervention on hygienic behaviors and rates of diarrhea. American Journal of Epidemiology 125:292-301.

Starkey G (1657). Nature’s explication and Helmont’s vindication, or a short and sure way to a long and sound life. London: E Cotes for Thomas Alsop at the two Sugar-loaves over against St Antholin’s Church at the lower end of Watling Street.

Storey J, Rossi-Espagnet A, Mandel SPH, Matsushima T, Lietaert P, Thomas D, Brøgger S, Duby C, Gramiccia G (1973). Sulfalene with pyrimethamine and chloroquine with pyrimethamine in single-dose treatment of Plasmodium falciparum infections. Bulletin of the World Health Organization 49:275-282.

Stockwell MS, Catallozzi M, Camargo S, Ramakrishnan R, Holleran S, Findley SE, Kukafka R, Hofstetter AM, Fernandez N, Vawdrey DK (2015). Registry-linked electronic influenza vaccine provider reminders: a cluster-crossover trial. Pediatrics 135:e75-82.

Stross JK, Banwell BF, Wolf FM, Becker MC (1986). Evaluation of an education program on the management of rheumatic diseases for physical therapists. Journal of Rheumatology 13:374-8.

Sutter EE, Ballard RC (1983). Community participation in the control of trachoma in Gazankulu. Social Science & Medicine 17:1813-1817.

Tabár L, Fagerberg CJ, Gad A, Baldetorp L, Holmberg LH, Gröntoft O, Ljungquist U, Lundström B, Månson JC, Eklund G, Day NE, Pettersson F (1985). Reduction in mortality from breast cancer after mass screening with mammography. Randomised trial from the Breast Cancer Screening Working Group of the Swedish National Board of Health and Welfare. Lancet 1:829-32.

Tripathy P, Nair N, Barnett S, Mahapatra R, Borghi J, Rath S, Rath S, Gope R, Mahto D, Sinha R, Lakshminarayana R, Patel V, Pagel C, Prost A, Costello A (2010). Effect of a participatory intervention with women’s groups on birth outcomes and maternal depression in Jharkhand and Orissa, India: a cluster-randomized controlled trial. Lancet 375:1182-92.

Van Helmont JB (1648). Ortus medicinæ: Id est, initia physic inaudita. Progressus medicinæ novus, in morborum ultionem, ad vitam longam. [The dawn of medicine: that is the beginning of a new physic: A new advance in medicine, a victory over disease, to (promote) a long life]. Amsterdam: Apud Ludovicum Elzevirium.

Vartiainen E, Puska P, Tossavainen K (1986). Prevention of non-communicable diseases: Risk factors in youth. The North Karelia Youth Project (1984-88). Health Promotion 1:269-83.

Vogt RL, Larue D, Klaucke DN, Jillson DA (1983). Comparison of an active and passive surveillance system of primary care providers for hepatitis, rubella, and salmonellosis in Vermont. Am J Public Health 73:795-7.

Waber DP, Vuori-Christiansen L, Ortiz N, Clement JR, Christiansen NE, Mora JO, Reed RB, Herrera MG (1981). Nutritional supplementation, maternal education, and cognitive development of infants at risk of malnutrition. Am J Clin Nutr (Suppl 4):807-13

Walter HJ, Hofman A, Connelly PA, Barrett LT, Kost KL (1985). Primary prevention of chronic disease in childhood: changes in risk factors after one year of intervention. American Journal of Epidemiology 122:722-81.

West KP, Djunaedi E, Pandji A, Kusdiono, Tarwotjo I, Sommer A (1988). Vitamin A supplementation and growth: a randomized community trial. American Journal of Clinical Nutrition 48:1257-64.

Wilson DM, Taylor DW, Gilbert JR, Best JA, Lindsay EA, Willms DG, Singer J (1988). A randomized trial of a family physician intervention for smoking cessation. JAMA 260:1570-4.

World Health Organisation European Collaborative Group (1986). European collaborative trial of multifactorial prevention of coronary heart disease: final report on the 6-year results. Lancet 1:869-72.

Yokley JM, Glenwick DS (1984). Increasing the immunization of preschool children; an evaluation of applied community interventions. Journal of Applied Behaviour Analysis 17:313-325.

Appendix: Literature search for pre-1990 reports of Cluster Randomized Trials (CRTs)

We identified reports of cluster randomized or quasi-randomized trials (CRTs) of any healthcare intervention, published before 1990, that had at least two clusters in each arm.

We searched records submitted to the Cochrane Central Register of Controlled Trials (CENTRAL) by the following six Cochrane groups: Effective Practice and Organization of Care (EPOC); Consumers and Communication (COMMUN); Public Health (PUBHEALTH); HIV/AIDS (HIV); Sexually Transmitted Diseases (STD); and Infectious Diseases (INFECTN). The combined search strategies are shown below. We also hand-searched the reference lists of key review and methodology papers, and unpublished databases found through personal contacts. We screened titles and abstracts from these sources and retrieved full texts for all possible candidate studies.

Search for pre-1990 CRTs

Cochrane Central Register of Controlled Trials (CENTRAL), 2010, Issue 3, part of The Cochrane Library. www.thecochranelibrary.com (accessed 20 September 2010)

  1. MeSH descriptor Cluster Analysis, this term only
  2. MeSH descriptor Small-Area Analysis, this term only
  3. cluster*:ti,ab,kw
  4. (randomi* or randomly) NEAR/3 (group or groups or community or communities or site or sites or district or districts or institution or institutions or hospital or hospitals or ward or wards or unit or units or clinic or clinics or department or departments or facility or facilities or center or centers or centre or centres or school or schools or village or villages):ti,ab,kw
  5. (community or communities) NEAR/3 intervention*:ti,ab,kw
  6. (#1 OR #2 OR #3 OR #4 OR #5)
  7. (#6), to 1990

Search for the register of policy CRTs

Cochrane Central Register of Controlled Trials (CENTRAL) 2010, Issue 3, part of The Cochrane Library. www.thecochranelibrary.com (accessed 27 September 2010)

  1. MeSH descriptor Cluster Analysis, this term only
  2. MeSH descriptor Small-Area Analysis, this term only
  3. cluster*:ti,ab,kw
  4. (randomi* or randomly) NEAR/3 (group or groups or community or communities or site or sites or district or districts or institution or institutions or hospital or hospitals or ward or wards or unit or units or clinic or clinics or department or departments or facility or facilities or center or centers or centre or centres or school or schools or village or villages):ti,ab,kw
  5. (community or communities) NEAR/3 intervention*:ti,ab,kw
  6. (#1 OR #2 OR #3 OR #4 OR #5)
  7. “SR-EPOC” or “SR-COMMUN” or “SR-PUBHLTH” or “SR-HIV” or “SR-INFECTN” or “SR-STD”
  8. (#6 AND #7)

Reviews, methodology papers, books searched

Bowater RJ, Abdelmalik SM, Lilford RJ (2009). The methodological quality of cluster randomized controlled trials for managing tropical parasitic disease: a review of trials published from 1998 to 2007. Trans R Soc Trop Med Hyg 103(5):429-36.

Donner A, Klar N (2000). Design and analysis of cluster randomization trials in health research. London: Arnold.

Eldridge SM, Ashby D, Feder GS, Rudnicka AR, Ukoumunne OC (2004). Lessons for cluster randomized trials in the twenty-first century: a systematic review of trials in primary care. Clin Trials 1(1):80-90.

Eldridge S, Ashby D, Bennett C, Wakelin M, Feder G (2008). Internal and external validity of cluster randomized trials: systematic review of recent trials. BMJ 336(7649):876-80.

Handlos LN, Chakraborty H, Sen PK (2009). Evaluation of cluster-randomized trials on maternal and child health research in developing countries. Trop Med Int Health 14(8):947-56.

Isaakidis P, Ioannidis JP (2003). Evaluation of cluster randomized controlled trials in sub-Saharan Africa. Am J Epidemiol 158(9):921-6.

Kramer MS, Martin RM, Sterne JA, Shapiro S, Mourad D, Platt RW (2009). The double jeopardy of clustered measurement and cluster randomisation. BMJ 339:503-505.

Taljaard M, McGowan J, Grimshaw JM, Brehaut JC, McRae A, Eccles MP, Donner A (2010). Electronic search strategies to identify reports of cluster randomized trials in MEDLINE: low precision will improve with adherence to reporting standards. BMC Med Res Methodol 16;10:15.