Chalmers I (2013). UK Medical Research Council and multicentre clinical trials: from a damning report to international recognition.

© Iain Chalmers, James Lind Library, Summertown Pavilion, Middle Way, Oxford OX2 7LG, UK. E-mail: ichalmers@jameslind.net


Cite as: Chalmers I (2013). UK Medical Research Council and multicentre clinical trials: from a damning report to international recognition. JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/uk-medical-research-council-and-multicentre-clinical-trials-from-a-damning-report-to-international-recognition/)


Introduction

The UK Medical Research Council (MRC) had its 100th birthday in 2013, and its ‘Centenary Timeline’  (http://www.centenary.mrc.ac.uk/) contains the following brief reference to the Council’s role in the development of clinical trials:

1940-1949: “Randomised controlled trial design pioneered.
MRC scientists developed what is today the gold standard for clinical trial design while testing streptomycin to treat pulmonary tuberculosis.”

The website directs visitors to a British Medical Journal video made in 2009, in which Colin Blakemore, a neurophysiologist and former MRC Chief Executive, speaks to John Crofton (a pioneer of tuberculosis trials) and me about how randomization and blinding in clinical trials have helped to generate reliable evidence to inform clinical practice.

Surprisingly, the MRC itself has been curiously silent about the enduring value of its role in developing clinical trial methods, and about the important methodological legacy left by the director of its Statistical Research Unit – Austin Bradford Hill. Stephen Lock, a former editor of the British Medical Journal, has suggested that ‘the randomised controlled trial is a British invention’ and that Bradford Hill should have had a Nobel prize for his key role in helping to put medicine on a rational scientific footing (Lock 1994).

Although the MRC draws attention to some of the influential randomized trials that it has funded, it does not really celebrate the key role it played in the 1950s in developing and applying clinical trial methods (see Appendix). The MRC’s achievements in this sphere had become clear by the 1970s, yet a 2-volume, 700-page history of the MRC published in 1975 (made available on the Centenary website) assigns a mere 5 pages to ‘Clinical evaluation of remedies’, and fails to highlight the MRC’s role in developing scientifically more robust study designs (Landsborough Thomson 1975). This is rather like referring only to uses of the polymerase chain reaction without referring to the fundamental importance of developing the method itself. Furthermore, the important emergence of multicentre controlled trials is acknowledged in just two lines (Scadding 2000) of the 83-page report of a Wellcome Witness Seminar on clinical research in Britain between 1950 and 1980 (Reynolds and Tansey 2000).

How might the apparent reluctance of the MRC (and others) to take credit for something so creditworthy be explained? Research by Desirée Cox-Maximov (1997), Ben Toth (1998), Martin Edwards (2004) and Keith Williams (2005) helps to explain the Council’s lack of interest in controlled clinical trials during the 1930s.  More research is needed to gather relevant data and understand the MRC’s continuing lukewarm celebration of its role in developing clinical trial methods. I hope this article will help to prompt such research.

The MRC’s first multicentre controlled clinical trial – a bit of a shambles

Methods to improve the design of clinical trials took an important step forward with the adoption of alternate allocation schedules to create similar treatment comparison groups. Alternation began to be used in earnest at the beginning of the 20th century (Chalmers et al. 2011), notably in research on plague and cholera in India (Choksy 1900; Haffkine 1900; Bannerman 1905). The BMJ’s response to the Plague Commission’s emphasis on this feature of research design suggests that some people at least felt that it was no longer necessary to keep repeating this obviously desirable feature of reliable clinical trials:

[The Commission] lay the chief stress upon the fallacies resulting from an improper selection of control cases. Their lengthy and laboured criticisms on this matter are uninteresting. It is quite obvious that some of the statistics above quoted are unconvincing. The only important point the Commissioners bring out is that the one indisputable guarantee of the efficacy of a serum is its success when applied on the “alternate method,” every personal factor in the selection of cases being rigidly excluded (BMJ 1902).

Alternate allocation was also a key feature of controlled trials of serum treatment for pneumonia done in the United States during the second and third decades of the 20th century (Podolsky 2006; 2008). The results of these studies were objectively very encouraging, yet, in Britain, there was widespread scepticism about the value of a treatment that was quite expensive and also tedious to administer (Worboys 1993). Nevertheless, researchers in Edinburgh (Physicians to the Royal Infirmary, Edinburgh 1930) and London (Armstrong and Johnson 1932) began to do small alternate allocation trials of serum treatment for pneumonia. Although the MRC had provided a grant to Professor Murray Lyon in Edinburgh to buy serum for his alternate allocation trial, the Council was initially reluctant to get involved because FHK Green preferred a case series approach to a controlled experiment using alternation (Toth 1998).

Whatever the reasons may have been for apparently discounting the evidence from the United States, the MRC embarked on what turned out to be a poorly coordinated effort to recast initially separate initiatives in Aberdeen, Edinburgh, Glasgow and London as a multicentre trial, centrally controlled by the Council. In 1931, a year after the Wellcome Physiology Research Laboratory had made serum available to Edinburgh, Aberdeen and St Bartholomew’s Hospital in London, Walter Fletcher, secretary of the MRC, passed responsibility for organising and managing a multicentre trial to the newly formed Therapeutic Trials Committee (Edwards 2004; 2007), which had been convened to facilitate relations between drug companies, researchers and clinicians.

The first meeting of the Committee recognised that more data were needed and attempted to include a research group from Glasgow. It drew up a standard scheme of enquiry, which included standardised case and control selection and alternate allocation. It is not clear why the MRC recommended alternate allocation. There is no evidence that it did so on the basis of statistical advice, but there may have been some appreciation that having an alternate scheme at each trial centre would make it easier to combine or compare results (Toth 1998).

The four centres were asked to submit their data to MRC staff, who forwarded them to Professor Thomas Elliott, director of the Medical Unit at University College Hospital in London. On 10 November 1933, Professor Elliott chaired a conference to discuss the results, and the MRC asked him to prepare a report for publication, taking into account the variety of interpretations of the data expressed by the investigators (papers at the Public Record Office FD1/2372. Serum treatment of pneumonia).

Some commentators questioned whether the research had actually been needed. As Dr John Cowan of Glasgow wrote in a letter to the Secretary of the MRC:

On the facts available in U.S.A. and at home serum seems to me to be proved to be beneficial in [Type] I and probably proven in [Type] II. It should be available in consequence in ALL hospitals. Why have so many folk – in London and here too – fought shy of it? Why are not Barts etc all using it? The days of controls are no longer possible: it is not fair to them. (John Cowan to FHK Green, 17 November 1933).

In a letter to Dr A Landsborough Thomson at the MRC, Stanley Davidson, Professor of Medicine in Aberdeen, wrote:

Nine cases of Type I pneumonia in the control series died, and only one in the treated. We are unable to explain these excellent results except: (1) on the grounds of the beneficial effects of serum, and (2) that these results are exaggerated by chance, owing to the small number of cases involved. (LSP Davidson to A Landsborough Thomson, 24 November 1933).

To which FHK Green of the MRC responded:

I have told Professor Elliott that the evidence for the ‘miracle of Aberdeen’ appears to be unassailable and he has replied that, this being so, it must clearly be a case of ‘go to Peebles for pleasure: go to Aberdeen if you get pneumonia’ (FHK Green to LSP Davidson, 28 November 1933).

Soon after the conference of investigators organised by Elliott, the MRC asked the then young statistician Austin Bradford Hill to review the study. Bradford Hill produced a detailed critique in an internal and unpublished report dated 22 December 1933. It is very frustrating that this report, which occupies a key place in the history of controlled trials, has been mislaid. The historian Joan Austoker was able to inspect it at MRC Headquarters during the 1980s (Austoker and Bryder 1989). She reported that Bradford Hill had questioned the methods used in allocating cases into serum and control groups and that he had stressed that greater effort should be taken ‘that the division of cases really did ensure a random selection’ (Hill 1933, quoted in Austoker and Bryder 1989).

Data in the report indicate that the alternate allocation scheme had not been rigorously applied. Whatever Bradford Hill’s report concluded, it is clear that it must have been devastating: FHK Green, Secretary of the MRC, deemed it so damning that it was ‘to be kept, not only from public scrutiny, but even from the investigators themselves’ (Edwards 2004; 2007 p 99).

The published report of a flawed trial – unexpectedly very good in parts

Bradford Hill’s criticism of the study almost certainly (see below) led the MRC to ask him to help draft the report of this unsatisfactory multicentre trial (Medical Research Council 1934). Some aspects of the report reveal a methodological sophistication which was not evident among members of the Therapeutic Trials Committee (Toth 1998). Jan Vandenbroucke (1987) noted that the report contains ‘a beautiful discussion of selection and comparability of treatment groups.’ The section entitled ‘Selection of Cases for Treatment’ notes that: (i) the effects of some treatments are so dramatic and constant that carefully controlled research is not necessary; (ii) serum treatment for pneumonia is not such a treatment; (iii) trying to match patients treated with serum with control patients (to ensure that the two comparison groups were alike in all the respects that mattered) is impractical; (iv) assigning cases alternately to either a serum group or a control group addressed the need to have two comparable groups of patients.

The opening paragraph of the section reads as follows:

The good results of insulin on patients with diabetes or of liver treatment in pernicious anaemia are so constant that the trial of these remedies in a very few cases was enough to establish their value. With the antiserum treatment of lobar pneumonia the conditions are very different. The action of the serum is only that of a partial factor for good, and its influence may be overwhelmed by an infection that has been allowed several days to establish its dominance in the patient, or by other complicating factors that weaken the patient’s resistance. In order to measure precisely what this partial benefit may be it would be necessary to take two groups of cases of identical severity and initial history and compare the sickness and the fatality in each, the one being treated with serum and the other serving as a control. But this is impracticable, for very few cases, even of “Type 1” lobar pneumonia, are quite alike, and a sufficient number of similar cases could never be got together under one observer and under similar conditions. Some American workers have sought to avoid this difficulty by using a special system of ratings for the various harmful features of the disease, thus expressing each patient’s numerical value in reference to a common standard. Such differentiation seemed too intricate, and perhaps too much a matter of personal judgement, for the present inquiry. If a straightforward comparison of treated cases with controls, under the average conditions whereby patients succeed one another in the wards of a hospital, could not reveal any advantage for those treated by serum, then common sense would conclude that the use of this remedy should be disregarded in the routine of practical medicine. The method consequently agreed upon for London, Edinburgh and Aberdeen was that alternate cases of lobar pneumonia, taken simply in the order of their admission to hospital, should be used respectively for serum treatment and controls.
So far as possible both were treated in the same wards and under the care of the same physicians. In the independent inquiry at Glasgow, however, the “serum” cases were treated in the Royal Infirmary, and a series of patients of the same social stratum, admitted during the same period to the Belvedere Isolation Hospital under the care of one physician, served as the control group. It is clear that there may be serious fallacies in any system which contrasts a group of serum treated patients with a control group drawn from a different stratum of the population, or with a control group in a previous year, when the severity of the prevailing pneumonia might have been different.

Later paragraphs in this section of the paper (i) address eligibility criteria; (ii) describe measures to exclude, before allocation, patients deemed unlikely to benefit from serum; (iii) describe the exclusion from the analysis of patients who died less than 24 hours after allocation; and (iv) express concern that the sample may have been too small ‘for statistical purposes’. The second paragraph reads as follows:

Certain principles of selection were laid down so as to make the data derived from the centres homogeneous, and to exclude from the comparison patients in whom the serum could not be expected to have any effect. For the latter reason all patients admitted later than the fifth day of illness were excluded from the inquiry. Also all patients dying within twenty-four hours of admission to hospital were taken out of the series, though the evident severity of their illness would not have prevented their inclusion at first, either in the control or in the serum group. No case of pneumonia complicated by other obvious disease, such as gross nephritis, advanced heart disease, diabetes, etc., was accepted for either group. All forms diagnosed as bronchopneumonia were also excluded. That these limitations were desirable was agreed upon by all the workers at a preliminary conference on the subject. It will be appreciated, however, that, with such restrictions, it was difficult in three years to obtain fully adequate data for statistical purposes.

The final paragraph in this section of the BMJ report is also important in the methodological history of clinical trials. It begins by noting the higher case-fatality rate in older patients, and describes steps taken to reduce the likelihood of chance imbalances in the numbers of older patients between the serum and control groups. It ends by noting the importance of large numbers for reducing chance imbalances in important prognostic factors:

Sex was disregarded, but the question of age was too important to be neglected. Table II from the present series illustrates afresh the well-known fact that the fatality of lobar pneumonia tends to be much greater over the age of 40 than in younger persons. The fortuitous inclusion of a few more elderly patients in one group than the other might influence unfairly the final figures for comparison. It was therefore decided to omit from the series all patients under the age of 20 and over the age of 60, and to classify the remainder into broad age groups. It will be noted that this plan still left altogether unregulated the chance scatter of distribution of patients with severe or mild pneumonia into either the serum or the control groups, and also of those for treatment early or relatively late in the progress of the disease. It was thought better not to attempt a deliberate sorting of cases in respect of mildness or severity, but to trust that the distortion of chance scatter would become almost negligible in a fairly large number of cases. Reference to a possible influence of the “severity factor” on the results is, however, made later in the report.

It is not clear what is implied in the above paragraph by the reference to classifying eligible patients ‘into broad age groups’, but the next section of the report – entitled ‘Statistics of Results’ – suggests that it did not involve alternate allocation within strata defined by age. This section also draws attention to the fact that only in Aberdeen and London was there strict adherence to an allocation schedule based on alternation:

Subject to the criteria mentioned above, patients at London and Aberdeen were placed in the groups for serum treatment, or for control, alternately in the order of their admission to hospital without selection as to age or severity. At Edinburgh the same general rules and criteria were observed, and there was no selection of cases for serum treatment. But in some wards of the General Infirmary serum was not used throughout the whole period of the inquiry, and consequently the patients from these wards overload the number of controls. In the other wards the alternate case plan was maintained to the end. At Glasgow the alternate case plan was not used, but patients in one hospital were treated with serum and those in another hospital served as controls. Hence it is only at Aberdeen and London that the serum treated cases equal the control cases in number.

The analysis of the results involved comparing observed and expected numbers of deaths – the latter being defined as ‘those which would have been recorded if the serum treated groups had died at the same percentage rates as the corresponding controls’. Observed and expected numbers of deaths were compared in strata defined by centre, age of patients, and type of pneumonia. Inspection of these many subgroup analyses suggested variations in effects. The report notes, however: ‘This raised the question of the chance scatter of patients with poorer prognosis from any cause into either the serum or the control group preponderantly’, and goes on to conclude:

The variation in results at the different centres cannot be explained, but they show the difficulties in the way of accurately evaluating a treatment of this nature on the basis of small numbers of cases.

In brief, this report reveals a clear appreciation of (i) possible sources of allocation bias, and ways to reduce it; (ii) concern about the danger of being misled by the play of chance.
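The comparison of observed with expected deaths described above can be sketched in a few lines of code. This is only an illustration of the report’s definition of expected deaths: the strata and every number in them are invented for the example and are not the trial’s data, and the function and field names are assumptions of this sketch.

```python
def expected_deaths(strata):
    """Deaths expected in the serum groups if, within each stratum,
    they had died at the same rate as the corresponding controls."""
    total = 0.0
    for s in strata:
        control_rate = s["control_deaths"] / s["control_n"]
        total += s["serum_n"] * control_rate
    return total


# Hypothetical strata (invented figures, NOT taken from the 1934 report).
strata = [
    {"label": "age 20-39", "serum_n": 40, "serum_deaths": 2,
     "control_n": 40, "control_deaths": 4},
    {"label": "age 40-59", "serum_n": 20, "serum_deaths": 4,
     "control_n": 25, "control_deaths": 10},
]

observed = sum(s["serum_deaths"] for s in strata)
expected = expected_deaths(strata)
print(f"observed={observed}, expected={expected:.1f}")
```

With many small strata, as the report itself warns, differences between observed and expected numbers in any one stratum can easily reflect chance alone.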

In 1988 I sent Bradford Hill a copy of Vandenbroucke’s article, and he responded in a letter to me as follows:

Thank you for sending me the Dutch article on the history of the R.C.T. I am interested in his comment on the M.R.C. Therapeutic Trials Committee’s report on the serum treatment of lobar pneumonia which contains “a beautiful discussion of selection and comparability of treatment groups & that this came before the publication of Fisher’s Design of Experiments.” I feel certain that I wrote that para and I had learned from Pearson & Greenwood & Yule (vide the references No 21 & 22). I had applied that teaching to the M.R.C’s trial of a vaccine against whooping cough and was itching to apply it in the clinical field. Streptomycin provided the opportunity. Of course later I may have been influenced by Fisher but not very much – in fact in his famous ‘tea and milk’ experiment I think he was wrong. (Austin Bradford Hill to Iain Chalmers, 7 August 1988)

The methodological legacy of the MRC’s first multicentre clinical trial

For a decade after its initial foray into multicentre clinical trials, the MRC made no further attempts at such studies (Bryder 2010). Nevertheless, the lessons learned from this imperfectly conducted trial did pave the way for the methodologically robust trials that were to become a hallmark of the MRC’s work in the 1940s and 1950s.

The limitations of the MRC’s first multicentre clinical trial reflected the lack of relevant methodological experience among members of the MRC Therapeutic Trials Committee. As Ben Toth has noted:

During its existence [the MRC Therapeutic Trials Committee] did not organise one rigorous comparative clinical trial, despite prima facie evidence of the problems of not doing so. None of the factors that were later to be recognised as vital to producing meaningful evaluations of therapies were advocated by the TTC. (Toth 1998, p 172).

Although the authors of some articles describing alternate allocation trials done in the 1930s specify that their reports are ‘for the Therapeutic Trials Committee of the Medical Research Council’ (see, for example, Bray and Witts 1934; Snodgrass and Anderson 1937a; 1937b; Anderson 1939), Ben Toth’s assessment is that controlled trials using alternation were irrelevant to the primary objectives of the Therapeutic Trials Committee. Its objectives were to build on the laboratory approach and the MRC’s tradition of biological and physiological standardisation (see, for example, Lewis 1930), and to support the emergent British pharmaceutical industry (Toth 1998).

The lessons from the trial of serum for pneumonia are nevertheless likely to have been very important in leading Bradford Hill and some others within the MRC to go on to design large, methodologically robust trials using concealed allocation schedules. Four years later, in discussing the planning and interpretation of experiments in the Lancet and in the first edition of his book ‘Principles of Medical Statistics’, Bradford Hill states that the allocation of alternate cases to the treated and control groups ‘is often satisfactory’ because ‘in the long run (emphasis in the original) we can fairly rely upon this random allotment (my emphasis) of the patients to equalise in the two groups the distribution of other characteristics that may be important.’ (Hill 1937, p 5).

This reference to alternate allocation as if it was random allocation was to continue for many years (see, for example, Diehl et al. 1938; Armitage 2002; Anderson and Ferguson 1945; and Appendix). Peter Armitage (1992) has commented that Bradford Hill’s initial failure to distinguish clearly between alternation and randomization was due partly to an underestimate of the danger of selection bias, and partly to a feeling that alternation would be easier to swallow than randomization. In an article published half a century later, Bradford Hill wrote: “…I was trying to persuade the doctors to come into controlled trials in the very simplest form and I might have scared them off…I thought it would be better to get doctors to walk first, before I tried to get them to run” (Hill 1990).

Bradford Hill goes on in ‘Principles of Medical Statistics’ to outline how allocation can be done within strata defined by characteristics (such as age) known to have an influence on prognosis. In later editions of his book, Bradford Hill used data from the MRC trial of serum treatment for pneumonia to illustrate how this might be important. In the opening chapter of the 1946 edition, for example, he tabulates data (which cannot be derived from the published report) revealing differences in the age distributions of patients in the serum and control groups in the two centres in the study (Aberdeen and London) where alternation was said to have been used throughout the period of recruitment (Hill 1946, p 6-7).

Bradford Hill’s text implies that these differences reflect chance; but he may have suspected that they reflected biases introduced by failure to adhere strictly to the alternate allocation scheme. Indeed, in spite of his insistence that alternate allocation must be strictly applied, from the first edition of his book onwards, Bradford Hill does not comment on the fact that the totals of 159 and 163 patients in the serum and control groups are clearly incompatible with strict alternate allocation (Hill 1946, p 7).
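The arithmetic behind that incompatibility can be sketched as follows. Under strict alternation within a single centre, the two groups can differ in size by at most one patient, so pooling two centres (Aberdeen and London) allows a gap of at most two. The function names, and the decision to ignore any patients removed after allocation, are assumptions of this sketch rather than part of Bradford Hill’s account.

```python
def alternation_sizes(n_patients):
    """Return (first_group, second_group) sizes under strict alternation."""
    return (n_patients + 1) // 2, n_patients // 2


def compatible_with_strict_alternation(serum_n, control_n, n_centres):
    """True if the size gap could arise from strict alternation in each centre.

    Assumption: no patients are removed after allocation.
    """
    return abs(serum_n - control_n) <= n_centres


# Within one centre the gap is never more than one patient.
largest_gap = max(abs(a - b) for a, b in map(alternation_sizes, range(1, 500)))

# Totals quoted from Hill (1946) for the two alternating centres combined.
result = compatible_with_strict_alternation(159, 163, n_centres=2)
print(largest_gap, result)
```

Because the published report describes taking some patients out of the series after admission, a gap of four does not by itself show how the imbalance arose; the check shows only that these totals are not what unbroken alternation would produce.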

The imperfectly conducted but carefully assessed and reported MRC multicentre trial of serum treatment for pneumonia seems likely to have played a key role in one of the most important methodological advances in the history of clinical trials (Chalmers 1997; D’Arcy Hart 1999; Chalmers 1999; 2001). Steps were taken subsequently to conceal allocation schedules from those recruiting participants to prevent foreknowledge of treatment allocations.

The first multicentre MRC trial to do this was the trial of patulin for the common cold, which was designed and run by Philip D’Arcy Hart (MRC 1944; Clarke 2004; Chalmers and Clarke 2004).  As he said to me in an interview in 2003:

Everyone had thought we would use alternation, and we thought we were very clever in setting up a scheme with two patulin groups and two placebo groups using letters to designate each of the four groups, then using rotation to allocate people to the different groups…We thought we were doing something completely new.  We wanted to muddle people up. In fact we succeeded in muddling ourselves up.  We didn’t always remember what the letters stood for.  None of us was a statistician, but we felt that the patulin trial was the first decently controlled trial the MRC had done. (Philip D’Arcy Hart, interview with Iain Chalmers, 2 May 2003)

The exceptional methodological quality of the patulin trial led the MRC to ask Philip D’Arcy Hart and Marc Daniels (Crofton 2005) in its Tuberculosis Research Unit and Austin Bradford Hill from its Statistical Research Unit (Higgs 2000) to organise a multicentre trial of the then limited supplies of streptomycin for treating pulmonary tuberculosis (MRC 1948). The methodological advance manifested in the patulin and streptomycin trials was that they were designed to prevent the problem that had dogged the trial of serum treatment of pneumonia a decade earlier: steps were taken to prevent those involved in allocating people to the comparison groups knowing or predicting correctly which allocation was next in line. It is because this concealment of allocations was more likely to be achieved using random allocation than with alternation that allocation based on random numbers was used in the streptomycin trial (D’Arcy Hart 1999; Chalmers 2001; 2005; 2010a). As Richard Doll observed, randomization was introduced ‘to control allocation biases, not for any esoteric statistical reason.’ (Doll 2000).

International recognition of the MRC’s role in developing the science of clinical trials

Was Stephen Lock justified in suggesting that ‘the randomised controlled trial is a British invention’ (Lock 1994)? The streptomycin trial was certainly not the first trial to use allocation based on random numbers (see, for example, Bell 1941; Chalmers 2010b), but it has become iconic and is widely seen as ushering in a new age of clinical trials. The report of the streptomycin trial deserves its iconic status because it is exceptionally clearly written and describes the measures taken to prevent foreknowledge of allocations.

The trial also heralded the beginning of a substantial programme of clinical trials addressing a wide variety of questions (Appendix), organised under the aegis of one funder, and exploiting the new opportunities created by the establishment of a National Health Service. As far as I am aware there are no examples in other countries of comparable trial development programmes with these features (see, for example, Tucker 1960; Dowling 1975).

The MRC did not deliver the randomized controlled trial to the world fully formed. The trial of serum treatments for pneumonia (MRC 1934) and the patulin trial (MRC 1944) provided invaluable learning. Each multicentre trial was designed and run under the aegis of a steering committee, which almost always included a member of staff from the MRC Statistical Research Unit – if not Austin Bradford Hill, then Peter Armitage, Richard Doll, John Knowelden, Donald Reid, or Ian Sutherland. However, the learning continued throughout the 1950s. The Appendix shows how key methodological aspects of the MRC’s multicentre trials were reported, with descriptions ranging from less than 30 words to more than 300 words in length, and the names of representatives of the Statistical Research Unit who were members of trial planning committees. The quotations from the articles reveal that the distinction between random and alternate allocation, and the need to conceal allocation schedules from those making decisions about trial recruitment, were not always made as clear as they might have been.

An early example of international recognition of the MRC’s pioneering work in the design and management of clinical trials was Harvard University’s 1952 invitation to Bradford Hill to speak about ‘The Clinical Trial’, and the New England Journal of Medicine’s decision to publish his talk prominently (Hill 1952). International respect for Bradford Hill and other contributors to the MRC programme of clinical trials was made clear in 1959 when the Council for International Organizations of Medical Sciences (CIOMS), which had been established under the joint auspices of UNESCO and WHO, asked Bradford Hill to organise a conference on ‘Controlled Clinical Trials’. The meeting was held between 23 and 27 November 1959, in Vienna. Unfortunately CIOMS does not have any documents relating to the meeting (Sev Fluss, email to IC, 2 August 2013), so it is unclear why Vienna was selected, or who, apart from the speakers, attended.

All those presenting papers were British doctors and statisticians who, together, brought a wide range of practical experiences of clinical trials to the conference. The conference covered general issues, including ethics; aspects of design, management and analysis; trials of surgical as well as medical interventions; and exemplar trials in acute infections, pulmonary tuberculosis, rheumatoid arthritis, coronary thrombosis, and cancer. Given the methodological focus of this paper, Peter Armitage’s account of ‘The construction of comparable groups’ is of particular relevance (Armitage 1960).

The meeting generated an unexpectedly large demand for the background papers prepared for it, so, the following year, these were published as books in English  (Hill 1960) – Controlled Clinical Trials, and in French (Schwartz et al. 1960) – Les essais thérapeutiques cliniques. These books might reasonably be regarded as the earliest textbooks about clinical trials. Two years later, Bradford Hill drew on the practical experience that had been acquired between 1948 and 1960 in Statistical methods in clinical and preventive medicine (Hill 1962).  In that book he was clearer about what was needed to avoid allocation bias:

The appropriate cases having been accepted, they are allocated at random to one or another of the treatments under study – usually by the use of random sampling numbers…So that the observer may not be influenced in his decision as to whether or not a patient should be brought into the trial…it is sometimes wise to deny him any prior knowledge of the treatment which the patient will receive in the event of acceptance. (Hill 1962, pp 9-10).  

I hope I have shown that the MRC’s ‘Centenary Timeline’ is wrong to suggest that ‘MRC scientists developed the randomized controlled trial design between 1940 and 1949’. The MRC’s contribution is far more substantial than this implies, and it was independent of the statistical reasons for using randomization. It is high time that the Council made more of its achievement.

Acknowledgements

I am grateful to many people for help with this paper, particularly to Peter Armitage for sharing with me his firsthand knowledge of the growth of the Medical Research Council’s clinical trials programme after the Second World War, and to Linda Bryder, Mike Clarke, George Davey-Smith, Martin Edwards, Imogen Evans, Jeremy Howick, Stephen Lock, Scott Podolsky, Ben Toth, Ulrich Tröhler, and Jan Vandenbroucke for helpful comments on an earlier draft. Thanks also to Rebecca Brice for bibliographic help.

This James Lind Library commentary has been republished in the Journal of the Royal Society of Medicine 2013;106:498-509.

References

Anderson T (1939). Sulphanilamide in the treatment of measles. BMJ 1:716-718.

Anderson T, Ferguson MS (1945). Comparative effect of sulphonamide and penicillin in pneumonia. Lancet 2:805-808.

Armitage P (1960). The construction of comparable groups. In: Hill AB. Controlled Clinical Trials. Oxford: Blackwell Scientific Publications, pp 14-18.

Armitage P (1992). Bradford Hill and the randomized controlled trial. Pharmaceutical Medicine 6:23-37.

Armitage P (2002). Randomisation and alternation: a note on Diehl et al. JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/randomisation-and-alternation-a-note-on-diehl-et-al/).

Armstrong RR, Johnson RS (1932). Treatment of lobar pneumonia by anti-pneumococcal serum. BMJ 2:662-65.

Austoker J, Bryder L (1989). The National Institute for medical research and related activities of the MRC. In: Austoker J, Bryder L, eds. Historical perspectives on the role of the MRC. Oxford: Oxford University Press, pp 35-57.

Bannerman WB (1905). Introduction. In: Bannerman WB (ed.) Serum-therapy of plague in India. Scientific Memoirs by Officers of the Medical and Sanitary Departments of the Government of India. Calcutta, India: Office of the Superintendent of Government Printing, pp 1-27.

Bell JA (1941). Pertussis prophylaxis with two doses of alum-precipitated vaccine. Public Health Reports 56:1535-1546.

Bray GW, Witts LJ (1934). Pseudo-ephedrine in asthma. Lancet 1:788-90.

British Medical Journal (1902). Report on the Indian Plague Commission. 1:1155-1161.

Bryder L (2010). The Medical Research Council and clinical trial methodologies before the 1940s: the failure to develop a ‘scientific’ approach. JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/the-medical-research-council-and-clinical-trial-methodologies-before-the-1940s-the-failure-to-develop-a-scientific-approach/).

Chalmers I (1997). Assembling comparison groups to assess the effects of health care. J Roy Soc Med 90:379-386.

Chalmers I (1999). Why transition from alternation to randomisation in clinical trials was made. BMJ 319:1372.

Chalmers I (2001). Comparing like with like: some historical milestones in the evolution of methods to create unbiased comparison groups in therapeutic experiments. International Journal of Epidemiology 30:1170-1178.

Chalmers I (2005). Statistical theory was not the reason that randomisation was used in the British Medical Research Council’s clinical trial of streptomycin for pulmonary tuberculosis. In: Jorland G, Opinel A, Weisz G, eds. Body counts: medical quantification in historical and sociological perspectives. Montreal: McGill-Queens University Press, pp 309-334.

Chalmers I (2010a). Why the 1948 MRC trial of streptomycin used treatment allocation based on random numbers. JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/why-the-1948-mrc-trial-of-streptomycin-used-treatment-allocation-based-on-random-numbers/).

Chalmers I (2010b). Joseph Asbury Bell and the birth of randomized trials. JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/joseph-asbury-bell-and-the-birth-of-randomized-trials/).

Chalmers I, Clarke M (2004). The 1944 Patulin Trial: the first properly controlled multicentre trial conducted under the aegis of the British Medical Research Council. International Journal of Epidemiology 33:253-260.

Chalmers I, Dukan E, Podolsky SH, Davey Smith G (2011). The advent of fair treatment allocation schedules in clinical trials during the 19th and early 20th centuries. JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/the-advent-of-fair-treatment-allocation-schedules-in-clinical-trials-during-the-19th-and-early-20th-centuries/).

Choksy NH (1900). Professor Lustig’s plague serum. Lancet 2:291-292.

Clarke M (2004). The 1944 patulin trial of the British Medical Research Council: an example of how concerted common purpose can get reliable answers to important questions very quickly. JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/the-1944-patulin-trial-of-the-british-medical-research-council-an-example-of-how-concerted-common-purpose-can-get-reliable-answers-to-important-questions-very-quickly/).

Cox-Maximov D (1997). The making of the clinical trial in Britain, 1910-1945: expertise, the state and the public. PhD thesis, University of Cambridge.

Crofton J (2005). Marc Daniels (1907-1953). JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/marc-daniels-1907-1953-a-pioneer-in-establishing-standards-for-clinical-trial-methods-and-reporting/).

D’Arcy Hart P (1999). A change in scientific approach: from alternation to randomised allocation in clinical trials in the 1940s. BMJ 319:572-573.

Diehl HS, Baker AB, Cowan DW (1938). Cold vaccines: an evaluation based on a controlled study. Journal of the American Medical Association 111:1168-1173.

Doll R (2000). The role of data monitoring committees. In: Duley L, Farrell B, eds. Clinical trials. London: BMJ Books, p 97.

Dowling H (1975). Emergence of the cooperative clinical trial. Transactions and studies of the College of Physicians of Philadelphia 43:20-29.

Edwards MV (2004). Control and the therapeutic trial, 1918-1948. MD thesis, University of London.

Edwards MV (2007). Control and the therapeutic trial: rhetoric and experimentation in Britain, 1918-48. Wellcome Series in the History of Medicine. Amsterdam: Rodopi.

Haffkine WM (1900). On preventive inoculation. Proceedings of the Royal Society of London 65:252-271.

Higgs E (2000). Medical statistics, patronage and the state: the development of the MRC Statistical Unit, 1911-1948. Medical History 44:323-340.

Hill AB (1933). Medical Research Council 1487, VI: A. Serum treatment of pneumonia. 22 December 1933. Cited in: Austoker J, Bryder L. The National Institute for medical research and related activities of the MRC. In: Austoker J, Bryder L, eds. Historical perspectives on the role of the MRC. Oxford: Oxford University Press, 1989:35-57.

Hill AB (1937). Principles of medical statistics. London: Lancet, p 5.

Hill AB (1946). Principles of medical statistics. London: Lancet, p 7.

Hill AB (1952). The clinical trial. New England Journal of Medicine 247:113-119.

Hill AB (1960). Controlled Clinical Trials. Oxford: Blackwell Scientific Publications.

Hill AB (1962). Statistical methods in clinical and preventive medicine. Edinburgh and London: Livingstone. 

Hill AB (1990). Memories of the British streptomycin trial in tuberculosis: the first randomized clinical trial. Controlled Clinical Trials 11:77-9.

Landsborough Thomson A (1975). Half a century of medical research, Vol. 2: The programme of the Medical Research Council (UK). London: Her Majesty’s Stationery Office, pp 236-241.

Lewis T (1930). Angina pectoris associated with high blood pressure and its relief by amyl nitrate. Heart 14:305-327.

Lock S (1994). The randomised controlled trial – a British invention. In: Lawrence G, ed. Technologies of modern medicine. London: Science Museum, pp 81-87.

Medical Research Council (1934). The serum treatment of lobar pneumonia. BMJ 1:241-45.

Medical Research Council (1944). Clinical trial of patulin in the common cold. Lancet 2:373-5.

Medical Research Council (1948). Streptomycin treatment of pulmonary tuberculosis. BMJ 2:769-82.

Physicians to the Royal Infirmary, Edinburgh (1930). A report on lobar pneumonia: treatment by concentrated antiserum. Lancet 2:1390-1394.

Podolsky SH (2006). Pneumonia before antibiotics: therapeutic evolution and evaluation in twentieth-century America. Baltimore: Johns Hopkins University Press.

Podolsky SH (2008). Jesse Bullowa, specific treatment for pneumonia, and the development of the controlled clinical trial. JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/jesse-bullowa-specific-treatment-for-pneumonia-and-the-development-of-the-controlled-clinical-trial/).

Reynolds LA, Tansey EM (2000). Clinical research in Britain, 1950–1980. Wellcome Witnesses to Twentieth Century Medicine, Vol. 7. London: The Wellcome Trust.

Scadding JG (2000). In: Reynolds LA, Tansey EM, eds. Clinical research in Britain, 1950–1980. Wellcome Witnesses to Twentieth Century Medicine, Vol. 7. London: The Wellcome Trust, p 97.

Schwartz D, Flamant R, Lellouch J, Rouquette C (1960). Les essais thérapeutiques cliniques: méthode scientifique d’appréciation d’un traitement. Paris: Masson.

Snodgrass WR, Anderson T (1937a). Prontosil in the treatment of erysipelas: a controlled series of 312 cases. BMJ 2:101-104.

Snodgrass WR, Anderson T (1937b). Sulphanilamide in the treatment of erysipelas: a controlled series of 270 cases. BMJ 2:1156-9.

Toth B (1998). Clinical trials in British medicine 1848-1948, with special reference to the development of the randomised controlled trial. PhD thesis, University of Bristol.

Tucker WB (1960). The evolution of the cooperative studies in the chemotherapy of tuberculosis of the Veterans’ Administration and Armed Forces of the U.S.A. Advances in Tuberculosis Research 10:28.

Vandenbroucke JP (1987). A short note on the history of the randomized controlled trial. Journal of Chronic Diseases 40:985-987.

Williams KJ (2005). British pharmaceutical industry, synthetic drug manufacture and the clinical testing of novel drugs 1895-1939. PhD thesis, University of Manchester.

Worboys M (1993). Treatments for pneumonia in Britain 1910-1940. In: Löwy I, ed. Medicine and change: historical and sociological studies in medical innovation. Paris: Les Editions INSERM, pp 317-335.

Appendix:

MRC multicentre controlled trials, arranged by year of first substantive report (1944-1960), with text describing methods used to reduce allocation bias

Patulin for the common cold
‘Recording: Patients were seen by the MO before receiving treatment from the factory nurse or sick-bay attendant. The MO remained ignorant which of the test solutions the patient received and neither the MO nor the nurse knew which contained patulin. At the first attendance the MO filled in the record card and satisfied himself that the patient was in fact suffering from the common cold and not from some condition such as hay fever or “chronic catarrh”. He then detached the counterfoil and gave it to the patient who took it to the nurse in an adjoining room. The nurse gave out the test solutions in strict rotation, each patient receiving his own bottle of solution from which all treatments were given. The distinguishing letter of the particular solution given was ringed by the nurse on the patient’s counterfoil. The record sheet and counterfoil were filed separately by the MO and nurse respectively.’ [Lancet 1944;2:373-375] MRC SRU not represented on planning committee.
Streptomycin for pulmonary tuberculosis
‘The control scheme: Determination of whether a patient would be treated by streptomycin and bed rest (S case) or by bed rest alone (C case) was made by reference to a statistical series based on random sampling numbers drawn up for each sex at each centre by Professor Bradford Hill; the details of the series were unknown to any of the investigators or to the coordinator and were contained in a set of sealed envelopes, each bearing on the outside only the name of the hospital and a number.’ [BMJ 1948;2:769-782] Hill on planning committee.
Vaccine prevention of whooping-cough
‘Allocation of Children to the Vaccinated and Unvaccinated Groups: On receipt of signed consent forms from the parents, children with no previous history of pertussis or of previous inoculation with pertussis vaccine were classified by sex and placed in one of the following age groups: 6-8, 9-11, 12-14, and 15-17 months; a few children were accepted just after they had reached the age of 18 months. For each age and sex group, sheets were previously drawn up on which vaccine letters A, B, C, and D in random order were repeated a sufficient number of times to deal with all expected volunteers in the appropriate age and sex group. As each child’s name was received it was written in the first vacant space on the appropriate sheet, and the vaccine letter opposite the child’s name determined what it should receive and was inserted on the child’s record card.’ [BMJ 1951;1:1463-1471]

‘Children in the 1948-51 trials were allocated to a vaccine group in succession as their names were received. In the 1951-4 trials…children born on odd days of the month were allocated to one group and children born on even days to the other.’ [BMJ 1956;2:454-62]. Armitage, Hill and Knowelden on planning committee.
Antibiotic treatment for whooping cough
‘The order in which patients were allocated to the treatments was determined by placing the nine letters in a randomly determined sequence. A separate sequence was constructed for each series of patients in each age and sex group.’ [BMJ 1951;1:1463-1471]. Hill on planning committee.
Antihistaminic drugs for the common cold
‘Whether they were to be given the drug or the dummy was decided randomly, and neither the patients nor the clinical observer knew which any particular patient had received’. [BMJ 1950;2:425-431] Hill on planning committee.
Intermittent streptomycin for pulmonary tuberculosis
‘…the same mechanism as in the first Medical Research Council series was used for recruiting and for allotting them at random to the different treatment groups.’ [BMJ 1950;1:1224-1230] Armitage on planning committee.
Streptomycin and PAS for pulmonary tuberculosis
‘After selection by the panel, the determination of the treatment group (P, S, or SP) in each case was made by reference to a randomly constructed list (based upon random sampling numbers) held confidentially in the Tuberculosis Research Unit.’ [BMJ 1950;2:1073-1085] Hill on planning committee.
Antibiotics for pneumonia
‘Allotment to Treatment Groups – For each hospital three series of cards were made up, each containing an equal number bearing the words “aureomycin”, “chloramphenicol”, or “standard”. These cards were put in blank envelopes, placed in random order, and then given serial numbers. When it had been decided on admission that a patient had pneumonia the next envelope was opened, and in this way the treatment group was determined.’ [BMJ 1951;2:1361-1365] Knowelden on planning committee.
Combined chemotherapy for preventing streptomycin resistance
‘After acceptance by the panel the determination of the treatment group in each case was made by reference to a list (based on random sampling numbers) held confidentially in the Tuberculosis Research Unit.’ [BMJ 1952;1:1157-1162] Hill on planning committee.
Isoniazid for pulmonary tuberculosis
‘After acceptance for the trial, the determination of the treatment for each case was made by reference to prearranged lists based on random sampling numbers (drawn up in the Statistical Research Unit for each of the subgroups mentioned later and for each hospital); for some of the large hospitals, series of sealed envelopes were prepared by the Tuberculosis Research Unit from these lists, so that the treatment allocated could be determined on the spot; for the others…the treatment was indicated from lists held confidentially by the Tuberculosis Research Unit.’ [BMJ 1952;2:735-746] Hill on planning committee.
Isoniazid with streptomycin or PAS for pulmonary tuberculosis
‘Every other patient has been allocated at random to one of the four treatment series…The number of patients allocated to the S2H series is not intentionally larger than the number in the 10PH series; the difference has arisen by chance as a result of the operation of random allocation in the 50 hospitals’. [BMJ 1953;2:1005-1014]. Hill on planning committee.
Vaccine prevention of influenza
‘Some 1,200 volunteers in 11 universities and 30 hospitals thus obtained were at each centre inoculated with the vaccines in serial order, so that approximately equal numbers in each centre received one of the vaccines’. [BMJ 1953;2:1173-1177]. Sutherland on planning committee.
Treatments for infantile diarrhoea and vomiting
‘Throughout each trial successive cases in each age and severity group were allocated in rotation to the appropriate treatment groups…Minor differences in the numbers of cases appearing in concurrent treatment groups were due partly to errors of allocation in the early stages of the investigation and partly to subsequent exclusions…at two centres trials were conducted “blind” – i.e., without the physicians knowing which patients were receiving the trial drug’. [Lancet 1953;2:1163-1169]  Doll on planning committee.
Cortisone and aspirin for early rheumatoid arthritis
‘Allocation to Treatment: …The clinician at the centre accepted a patient as suitable on the basis of the criteria laid down. Having done so he applied to a central office to know whether the treatment of that patient should be with cortisone or aspirin. At the central office a register had been constructed showing the order in which the treatments were to be applied. It was held by one person. Such a random order of treatments was constructed separately for each of 48 small subgroups of patients…Within each of these small subgroups random sampling numbers were used to give approximately equal numbers of patients on cortisone and aspirin….In detail, the technique was that on admitting a patient a treatment centre would send particulars of the sex, age and duration of illness to the holder of the register. The prepared list for a person of that sex, age, and duration would be consulted, the patient’s name entered on the next available line, and the nature of the treatment already inscribed on that line would, with supplies of hormone and aspirin, be sent to the treatment centre.’ [BMJ 1954;1:1223-1227]. Hill on planning committee.
Hormones for diabetes in pregnancy
‘…nine centres were asked to cooperate in a centrally planned and controlled investigation…patients were allocated at random to “hormone-treated” (H) and “non-hormone-treated” (NH) groups…Within each age-parity group, patients were allocated at random to hormone and non-hormone series. Because of the small numbers involved in any one group, equality between the two series was ensured by allocating the first of the patients to be presented at random to one or other form of treatment, the second to the opposite form and so in pairs… When accepted and allocated, patients were given a series number which was attached to the clinic record-sheet and to every bottle of tablets required by each patient throughout her pregnancy. Neither patient nor clinician knew the type of preparation given, and the tablets, active and inert, were identical in size, appearance, taste and packing.’ [Lancet 1955;2:833-836] Reid on planning committee.
ACTH, cortisone and aspirin for acute rheumatic fever
‘Allocation of Patients to Treatment: Patients on admission to the study were divided into two age groups…and into three groups according to the length of time in each case between the date of onset of the attack and the date at which therapy began…For each of these three duration groups, within each age group, and separately for each centre, the three treatments, A.C.T.H., cortisone, and aspirin, were listed in random order for as many patients as were likely to be admitted, using random sampling numbers and keeping the numbers of patients on the three treatments approximately equal in each centre. The co-ordinating centre in each country issued serially numbered and sealed envelopes to the treatment centres. Thus, on admission of a patient of given age and specified duration-from-onset group, the investigator at the treatment centre had merely to open the next available envelope for that particular group to find a statement of the treatments to be applied. He was therefore unable to predict the treatment for his next case. The allocation was both “blind” and random. [In a few centres, for varying durations of time, the investigators wished to withhold some patients for other studies. In these instances a pre-determined proportion of the envelopes contained, as an instruction, “free case”. Such exclusions, therefore, were also “blind” and randomly determined, and could not bias the group brought into the study]…The admission report on the patient, and the assignment envelope, were then sent to the co-ordinating centre. If for any special reason the investigator decided in advance that a patient fulfilling the criteria should not be admitted to the study, no envelope was opened. In every such case, however, an admission report was required, together with an explanation as to why the patient had not been admitted. Six such cases were reported in the U.S.; none in the U.K.’ [BMJ 1955;1:555-574] Hill on planning committee.
Cortisone for chronic asthma
‘Allocation to Treatment: The two treatment groups were constituted by random allocation. Patients in one group received cortisone acetate and in the other group placebo tablets.’ [Lancet 1956;2:798-803]. MRC SRU not represented on planning committee.
Preventing thrombophlebitis after intravenous infusions
‘Seven hospitals took part in the trial. Several methods were used to ensure random use of a rubber or plastic set.  Thus in one hospital, the choice of set depended on the day of admission, in other hospitals it depended on whether the patient’s hospital number was odd or even, and in others the type of set was alternated between consecutive cases.’ [Lancet 1957;1:595-597] Armitage on planning committee.
Vaccine prevention of poliomyelitis
‘…it was impossible to adopt the ideal of a controlled trial in which children were randomized to two groups, one to receive vaccine and the other placebo injections. It was, however, anticipated that the demand for vaccine would be in excess of the supply. If this were so it was intended to compare the incidence of poliomyelitis in those who received vaccine with those who wanted it but for whom none was available. As a form of random allocation the selection of children to be vaccinated was to be made according to month of birth.’ [BMJ 1957;1:1271-1277] Hill and Knowelden on planning committee.
Cortisone and prednisone for rheumatoid arthritis
‘Allocation of Treatment: At each participating centre pairs of patients were matched as closely as possible for age, sex, duration of disease, and duration of previous cortisone therapy. One of each pair, selected randomly, continued to receive cortisone acetate; the other was changed to prednisone acetate.’ [BMJ 1957;2:199-202] Hill on planning committee.
Antibiotics for severe bronchiectasis
‘The difficulties in a trial of this character will be appreciated, not least of them being the subjective improvement which might be expected when more intense interest is shown.  This difficulty was met by introducing a control medicament, lactose, and making the trial a “blind” one in which neither the clinician nor the patient knew which drug was being given…If [the patient] satisfied the criteria for admission he was allocated at random, according to a centrally held list, to one of the three treatment groups – penicillin, oxytetracycline, or lactose.’  [BMJ 1957;2:255-259] Knowelden on planning committee.
BCG vaccine for preventing tuberculosis
‘…tuberculin tests were done on 6,405 children aged 13 and 14 years, and those who were tuberculin-negative were divided by an effectively random process into two groups of similar size.’ [BMJ 1958;1:79-83] Sutherland on planning committee.
Influenza vaccine for chronic bronchitis
‘Allocation of each patient to one or other vaccine was made by means of a random order list…Neither the patients nor their doctors knew the identity of the vaccines given.’ [BMJ 1959;2:905-908] MRC SRU not represented on planning committee.
Long-term anticoagulation after cardiac infarction
‘Allocation of Patients: Prognosis after infarction is generally believed to depend in part on the number of previous attacks in the same patient. Allocation to treatment regimes was therefore based on a primary division of suitable patients according to the number of previous infarcts; and a random distribution of the patients in each centre within these infarct groups was made to high or low dosage levels.  For each centre, batches of sealed envelopes were prepared, marked according to the previous infarct history grouping, and enclosing a card giving the dosage level.  The order of the allocation was randomly determined. When the clinician in charge at any centre considered that a patient was suitable for inclusion, and had classified him according to his previous infarction history, he obtained the treatment group allocation by opening the next in the appropriate batch of envelopes.’ [BMJ 1959;1:803-810] Reid on planning committee.