At the end of the 1950s, the Council for International Organizations of Medical Sciences (CIOMS) organised a meeting under the joint auspices of the UN Educational, Scientific and Cultural Organisation (UNESCO) and the World Health Organization (WHO), “to discuss the principles, organization and scope of ‘controlled clinical trials’, which must be carried out if new methods or preparations used for the treatment of disease are to be accurately assessed clinically”. The meeting took place in Vienna between 23 and 27 March 1959, under the chairmanship of Austin Bradford Hill, director of the Medical Research Council’s Statistical Research Unit. Grants supporting the meeting were made available by the Wellcome Trust and the Oesterreichische Bundesministerium fuer Soziale Verwaltung.
The Executive Secretary of CIOMS (JF Delafresnaye) explained how the conference had been organised:
The conference was in itself an experiment. The meeting was a closed one, and only one hundred participants were invited. One national group, the British, was charged with the task of presenting each topic to be studied. In this way, all papers were coordinated in London by Professor Bradford Hill so that overlap was avoided and ample time allowed for discussion (‘J.F.D.’, in Hill 1960, p vii).
Austin Bradford Hill arranged for 23 papers to be presented by British statisticians, physicians and a surgeon at the meeting (Hill 1960). Eleven of the 23 papers were presented by members of the Medical Research Council’s Statistical Research Unit. In addition to Austin Bradford Hill, these were Peter Armitage, Richard Doll, John Knowelden, Donald Reid and Ian Sutherland.
Which aspects of clinical trials were covered at the meeting and in the book?
The proceedings painted a clear picture of the British concept of the controlled clinical trial, and the place that it held in medicine. The proceedings were timely because WHO had recently decided to undertake research. The text began with the ethics of controlled trials, the construction of comparable groups, and the criteria for diagnosis and assessment [pp 3-28]. Parts two to four dealt respectively with clinical trials for acute infections, pulmonary tuberculosis, and rheumatoid arthritis. Ian Sutherland described the treatment of pulmonary tuberculosis as having been the most intensively studied in post-war controlled clinical trials so that, by 1959, three major drugs (isoniazid, streptomycin and para-amino-salicylic acid), and others less successful, had been investigated:
alone and in combination, in varied dosages and rhythms, in courses of different lengths, and in different environmental circumstances, in many countries throughout the world. As a consequence, the statistical requirements for such trials are reasonably well understood, although they are unfortunately not always well applied.
Five percent of patients were already recognized as infected with bacilli resistant to one or more of the three commonly used drugs. George Pickering (p 165) lauded the tuberculosis work as: “done without any waste of time and without unnecessary waste of human life or suffering”.
Ian Sutherland noted (p 48) that the number of patients needed in a trial depends not only on the inherent variability of the disease-course in individual patients but also on the expected difference between the average response to the new and the established treatment. He observed that if an established treatment is highly effective, very large numbers of patients may be needed to show with reasonable certainty that another drug is more effective still:
…because there is so little scope left for further advances, so that the superiority of the new treatment – if it is superior – can only be slight. In these circumstances we are usually content to demonstrate, on rather smaller totals of patients, that the treatments are (or are not) of the same high order of effectiveness.
Equivalence trials, which require us to make precise what we shall be content to label as being “of the same high order of effectiveness” were not fully worked out in 1959, but Austin Bradford Hill (p 168) was sage on the matter of trial size and surrogate endpoints:
In some fields one very obvious difficulty will arise. It is not difficult to show that a new and highly potent drug will reduce the death rate of patients from, say, 30 per cent to 10 per cent. The change is pronounced. We can hardly miss it. But, in the modern world, with one potent drug following another, the problem is to know whether a new one will reduce that 10 per cent to 7 per cent. That small difference in mortality is likely to be regarded as important by at least 3 per cent of patients. Without a very large scale and meticulously conducted trial it may be impossible to detect it. Indeed it may be impossible to prove such marginal differences by any means.
Perhaps the answer lies in sharpening up our means of assessment – that death or recovery is too crude. We may perhaps need to make more resort to the laboratory for measurements of the patient’s reactions. But . . . If the sedimentation rate falls, the pulse is steady, and the blood pressure impeccable, we are still not much better off if unfortunately the patient dies. The clinical judgement of the whole is, or may be, still fundamental.
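Hill's contrast between a fall in mortality from 30 to 10 per cent and one from 10 to 7 per cent can be made concrete with a rough sample-size calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions. The two-sided 5 per cent significance level and 90 per cent power are assumptions chosen for illustration, and the function name is hypothetical; none of this appears in the 1960 proceedings.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_group(p_control, p_new, alpha=0.05, power=0.90):
    """Approximate patients needed in each arm to compare two death rates.

    Normal-approximation formula for two independent proportions.
    alpha (two-sided) and power are illustrative assumptions.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_new) / 2
    # Variability under "no difference" and under the expected difference.
    under_null = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    under_alt = z_beta * sqrt(
        p_control * (1 - p_control) + p_new * (1 - p_new)
    )
    return ceil((under_null + under_alt) ** 2 / (p_control - p_new) ** 2)


# Hill's two scenarios: death rates of 30% vs 10%, and 10% vs 7%.
large_effect = patients_per_group(0.30, 0.10)
small_effect = patients_per_group(0.10, 0.07)
print("30% -> 10%:", large_effect, "patients per group")
print("10% ->  7%:", small_effect, "patients per group")
```

Under these assumptions the first comparison needs on the order of 80 patients per group and the second on the order of 1,800, which is the practical difficulty Hill describes.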
Other designs in controlled clinical trials, as described by Donald Reid (p 87), concerned the patient as his own control, also the concurrent assessment of several treatments, including by factorial designs (2×2 or 2×2×2) as advocated by Doll (p 94), and the sequential approach of Peter Armitage (p 100) – which, despite reservation by Lawrence Witts (p 11), had a great future according to Austin Bradford Hill (p 169). Sequential boundaries were illustrated by Peter Armitage’s then recently-published work with E.S. Snell (Snell and Armitage 1957) on a randomized within-patient comparison of three cough linctuses: placebo, pholcodine and heroin!
There followed descriptions of coronary thrombosis and cancer trials with ‘intention to manage’ clearly enunciated by Ralston Paterson (p 125), see also John Knowelden (p 156 versus p 34) and Peter Armitage (p 18), but Hedley Atkins (p 135) considered it neither tactful nor kindly to explain to a dying patient that their treatment allocation had been determined by the toss of a coin. Papers by Donald Reid (who described what was done and why, p 109) and George Pickering (who summarized what happened) included detailed outcome tables for 188 patients who, after myocardial infarction, were randomized to receive 1 mg phenindione a day (low dose: of whom 31 died) versus 195 to receive enough phenindione to maintain prothrombin time at 2 to 2.5 times the normal (high dose: of whom 22 died). George Pickering summed up their team-spirit (p 163):
What we have been reporting is not my work: it is not Dr Reid’s work; it is the work of a team of which we happened to be the officers. One of the unforeseen results of the trial was a remarkable camaraderie between prima donnas. I think we all learned a lot, about myocardial infarction and about one another.
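The mortality figures quoted above for the phenindione trial (31 deaths among 188 low-dose patients, 22 among 195 high-dose patients) can be summarised as a difference in death rates with an approximate confidence interval. The sketch below uses a simple Wald interval. That choice, the 95 per cent level and the function name are assumptions made for illustration; this is not the analysis presented by Reid or Pickering.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference(deaths_a, n_a, deaths_b, n_b, level=0.95):
    """Difference in death rates (a minus b) with a Wald interval.

    The Wald interval is an illustrative choice, not the method
    used in the original trial report.
    """
    risk_a = deaths_a / n_a
    risk_b = deaths_b / n_b
    diff = risk_a - risk_b
    se = sqrt(risk_a * (1 - risk_a) / n_a + risk_b * (1 - risk_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


# Low dose: 31 deaths of 188; high dose: 22 deaths of 195 (from the text).
diff, lower, upper = risk_difference(31, 188, 22, 195)
print(f"low-dose death rate:  {31 / 188:.1%}")
print(f"high-dose death rate: {22 / 195:.1%}")
print(f"difference: {diff:.1%} (95% interval {lower:.1%} to {upper:.1%})")
```

On this rough calculation the death rate was about five percentage points higher in the low-dose group, but the interval extends slightly below zero. This is consistent with Hill's remark that modest differences in mortality are hard to establish without large trials.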
George Pickering the physician lauded Austin Bradford Hill the statistician:
It is not only that you have been able to do the sums, Sir, you and your collaborators, but your most important contribution has been to make sure that you understood what the clinicians were talking about, and if you did, then probably they understood it too.
He also gave a half-remembered account of Pasteur’s veterinary “trial” of inoculation against anthrax (p 166).
In his concluding remarks George Pickering addressed the question of when and how clinical trials should be conducted (if one is in reasonable doubt about the effectiveness of a treatment, p 166), noting that there are still some old treatments which are in dispute:
As one goes from clinic to clinic, some believe that the patients should be treated one way, some believe that the patients should be treated in another way, and still others take a different view. Quite clearly all these doctors cannot be right, and I see no method of settling this problem except by a controlled clinical trial.
Half a century later, try substituting court for clinic, offender for patient, and judge for doctor (Bird 2004; Turner et al. 2013). Presciently, George Pickering also noted (p 163):
The vast amount of effort that everyone puts into these controlled trials earns them no place at the top of a paper. Therefore in some countries this might be regarded as a hindrance to the academic and professional progress of those taking part in a trial. That, I think, is a very important thing to be recognised.
The final part before pithy physicianly and statisticianly conclusions dealt with the organization of controlled clinical trials (D’Arcy Hart, p 145-), the design of records, and follow-up (Ian Sutherland, p 151-). Sutherland advised, inter alia, on index cards per patient to be kept in order of study-date and that different (but restful) colours for different forms are sometimes helpful. The analysis and presentation of results were described by John Knowelden (p 155-) who mooted that, although a single author cannot please all his collaborators, he (or she) is “more likely to produce a coherent document than would a group, and is best placed to ensure that text and tables are consistent throughout in the story they tell”. Good advice that stands the test of time.
The exposition of the British concept of the controlled clinical trial is astonishing for just how much had been got right within barely two decades: balanced randomization and stratification; the need to inform the referring doctor of patients included in a therapeutic trial; the universality of observer variation, as described by Charles Fletcher (pp 19-28); and reservations about paired-patient designs (p 51). Particularly striking was the marriage of practicality to principled thinking as in: reservation about whether a controlled trial is advisable when the results can be expected a priori to be meagre (Witts p 9); no method of allocation being so ethical as a controlled trial if there are only small supplies of a new drug; preparedness to defend therapeutic trials in the law courts but never take payment for their conduct (p 13); recommendation to eschew prisoner-participants but admit military recruits (p 38), as their consent was considered to be freely given; avoid trials of convenience – which exploit a local opportunity for researching possible advantages which the local patients cannot hope to share in.
Guy Scadding even anticipated network meta-analysis by his advocacy of ‘linked trials’ (p 52), and John Crofton, surprised by patients’ non-compliance with chemotherapy, wrote: “Had we known then what we know now, we would have arranged for the routine testing of patients’ urines at regular intervals . . .” (p 57). Philip D’Arcy Hart described surprise visits to the homes of patients, both for testing urines and for checking their drug stores, as the only satisfactory check on compliance (p 149). In his support of factorial designs, Richard Doll recounted an old Roman maxim that: “To do two things at once is to do neither” but Doll countered that ancient Rome was unaware of the powerful effects of random allocation (p 94).
In his summing up, Austin Bradford Hill (p 168) recounted Frank Green’s aphorism – that the statistician should be treated as an obstetrician and not as a morbid anatomist: in a trial from the very first and active in it throughout, co-equal and co-eternal with bacteriologists, pathologists and even clinicians.
How did the meeting come to be reported?
The Executive Secretary of CIOMS explained how the proceedings of the meeting came to be published, in French as well as in English:
Originally we did not intend to publish the proceedings in English as the literature on the subject is already extensive. It was felt, however, that a report of the conference in French would be desirable, and it is now in preparation [Schwartz et al. 1960]. But the number of requests for the working documents from many countries was so great that we decided to publish in full the introductory papers in mimeographed form. This limited edition was soon exhausted, and in view of the continued demand we have decided to bring out a printed version…
…The success of conference is due to the British group so ably led by Professor A. Bradford Hill. May they find here the expression of our gratitude (‘J.F.D.’, in Hill 1960, p vii).
In 1960, the proceedings were published in English by Blackwell under the title of Controlled Clinical Trials (Hill 1960), and in French by Masson under the title Les essais thérapeutiques cliniques: méthode scientifique d’appréciation d’un traitement (Schwartz et al. 1960). Despite the clear historical importance of the meeting, CIOMS has apparently no unpublished material relating to it (Chalmers 2013). Peter Armitage, the only surviving contributor to the meeting, remembers that he and others were told by Austin Bradford Hill, without much prior discussion, what was expected of each of them. He does not recall anyone from North America or Australasia attending the meeting, but remembers that Paul Martini, probably the most methodologically sophisticated German of the pre-WW2 era (Martini 1932; Stoll 2004; 2010), expressed some scepticism about parallel group randomized trials. Martini came later to retract his opposition to randomization and a few years later organized an international conference in Berlin, at which he expressed warm support for this approach (Peter Armitage, personal communication to Iain Chalmers & Sheila Bird). Martini’s co-organizers for the 23-26 October 1961 Berlin conference, Versuchsplanung in der klinische Medizin (Oberhoffer 1962), were Otto Nacke and Hubert Pipberger (Director of the Special Research Program in Medical Electronic Data Processing at the Veterans’ Administration Hospital, Washington DC). They were both founder members of the Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (German Society for Medical Informatics, Biometry and Epidemiology) and subsequently editors for Methods of Information in Medicine, which was founded in 1962 (McCray et al. 2011).
As pointed out by Iain Chalmers (2013), it is surprising that the Medical Research Council, even in its centenary year, failed to celebrate its substantial contribution to the history of clinical trials during the 1940s and 1950s, as reflected in the 1959 Vienna meeting. See, however, the Medical Research Council’s Centenary time-line (http://www.centenary.mrc.ac.uk/timeline/) for 1940-49. The absence of a substantive biography of Austin Bradford Hill is also surprising.
I thank Iain Chalmers for drawing my attention to this gem in the bibliography of the MRC Biostatistics Unit.
This James Lind Library article has been republished in the Journal of the Royal Society of Medicine 2015;108:372-375.
Bird SM (2004). Prescribing sentence: time for evidence-based justice. Lancet 364:1457-1459.
Bird SM, Goldacre B, Strang J (2011). We should push for evidence based sentencing in criminal justice. British Medical Journal 341:612 (d612. doi: 10.1136/bmj.d612).
Chalmers I (2013). UK Medical Research Council and multicentre clinical trials: from a damning report to international recognition. JLL Bulletin: Commentaries on the history of treatment evaluation (http://www.jameslindlibrary.org/articles/uk-medical-research-council-and-multicentre-clinical-trials-from-a-damning-report-to-international-recognition/).
Hill AB (1960). Controlled clinical trials. Oxford: Blackwell Scientific Publications.
Martini P (1932). Methodenlehre der Therapeutischen Untersuchung [Methodological principles for therapeutic investigations]. Berlin: Springer.
McCray AT, Gefeller O, Aronsky D, Leong TY, Sarkar IN, Bergemann D, Lindberg DAB, van Bemmel JH, Haux R (2011). The birth and evolution of a discipline devoted to information in biomedicine and health care. Methods of Information in Medicine 50:491-507.
Oberhoffer G (1962). Bericht über das Internationale Seminar für medizinische Dokumentation und Statistik, Berlin, 16-28 Oktober 1961. Methods of Information in Medicine 1:27-31.
Schwartz D, Flamant R, Lellouch J, Rouquette C (1960). Les essais thérapeutiques cliniques: méthode scientifique d’appréciation d’un traitement. Paris: Masson.
Snell ES, Armitage P (1957). Clinical comparison of diamorphine and pholcodine as cough suppressants by a new method of sequential analysis. Lancet 272:860-862.
Stoll S (2004). Paul Martini’s Methodology of therapeutic investigation. JLL Bulletin: Commentaries on the history of treatment evaluation (http://www.jameslindlibrary.org/articles/paul-martinis-methodology-of-therapeutic-investigation/).
Stoll S (2010). Paul Franz Xavier Martini (1889-1964). JLL Bulletin: Commentaries on the history of treatment evaluation (http://www.jameslindlibrary.org/articles/paul-franz-xavier-martini-1889-1964/).
Turner RM, Bird SM, Higgins JPT (2013). The impact of study size on meta-analyses: examination of underpowered studies in Cochrane reviews. PLoS ONE 8:e59202.