Chalmers I, Dukan E, Podolsky SH, Davey Smith G (2011). The advent of fair treatment allocation schedules in clinical trials during the 19th and early 20th centuries.

© Iain Chalmers, James Lind Initiative, Summertown Pavilion, Middle Way, Oxford OX2 7LG. Email: ichalmers@jameslind.net


Cite as: Chalmers I, Dukan E, Podolsky SH, Davey Smith G (2011). The advent of fair treatment allocation schedules in clinical trials during the 19th and early 20th centuries. JLL Bulletin: Commentaries on the history of treatment evaluation (http://www.jameslindlibrary.org/articles/the-advent-of-fair-treatment-allocation-schedules-in-clinical-trials-during-the-19th-and-early-20th-centuries/)


Introduction

The detailed and exceptionally clear 1948 report of the British Medical Research Council’s randomised trial of streptomycin for pulmonary tuberculosis is rightly regarded as a landmark in the history of clinical trials (MRC 1948). Of crucial importance, it describes how a treatment allocation schedule (based on random number tables) was concealed, thus preventing foreknowledge of allocations among those making decisions about patient participation (Chalmers 2005; 2010).

Although the report of the streptomycin trial is rightly iconic, the attention it has attracted has led many historians to overlook earlier evidence relevant to the evolution of unbiased prospective allocation of patients to treatment comparison groups. This has led some of them to assume that random allocation to treatment comparison groups reflected the development of statistical theory by RA Fisher (Chalmers 2010; Cox 2009). In fact, for half a century before the MRC trial and Fisher’s writings, some medical practitioners wishing to evaluate the effects of treatments had used alternate allocation to assemble similar groups of patients, and so ensure that like would be compared with like. And these developments reflected an even earlier history during which some clinicians and others began to conceptualise what was needed for tests of treatments to be fair (Tröhler 2010a; Kaptchuk 2011; Huth 2006a).

Appreciation of the need to compare like with like

More than a millennium ago, some clinicians appreciated that comparisons are needed to arrive at causal inferences about the effects of medical treatments. In the 9th century CE, the Persian physician Al-Razi (Rhazes) explained why he recommended that bloodletting be used to treat the symptoms of meningitis:

…I once saved one group [of patients] by it, while I intentionally neglected [to bleed] another group. By doing that, I wished to reach a conclusion (al-Razi 10th century CE; Tibi 2005).

Other people recognised centuries ago that, if treatment comparisons were going to be fair, like must be compared with like. Francesco Petrarch, in a letter to a fellow poet, wrote in 1364:

I once heard a physician of great renown among us express himself in the following terms… I solemnly affirm and believe, if a hundred or a thousand men of the same age, same temperament and habits, together with the same surroundings, were attacked at the same time by the same disease, that if one half followed the prescriptions of the doctors of the variety of those practising at the present day, and that the other half took no medicine but relied on Nature’s instincts, I have no doubt as to which half would escape (Petrarch 1364). [emphasis added]

(One assumes that the poet predicted that the reputation of the medical profession would not be enhanced by the fair comparison he was proposing!)

The writings of several medical researchers in the 18th century make clear that some of them appreciated the importance of comparing like with like in treatment comparisons (Tröhler 2010a). Isaac Massey, for example, challenging claims that inoculation was associated with much lower mortality than natural smallpox, observed that:

…to form a just comparison and calculate right in this case, the circumstances of the patients, must and ought to be as near as may be on a par (Massey 1723) [emphasis added]

And James Lind, in his account of a comparison of six different treatments for scurvy, was careful to note that factors other than the treatments were similar in the patients in his comparison groups:

…I took twelve patients in the scurvy… Their cases were as similar as I could have them. They all in general had putrid gums, the spots and lassitude, with weakness of their knees. They lay together in one place, being a proper apartment for the sick in the fore-hold; and had one diet common to all. (Lind 1753) [emphasis added]

Introduction of methods to ensure that like will be compared with like

Methods to ensure that like will be compared with like in fair treatment comparisons were proposed at least as early as the 17th century. Reflecting a time-honoured device for ensuring fairness (Silverman and Chalmers 2002), van Helmont (1648) proposed casting lots to decide which patients should be assigned to orthodox physicians (to be bled and purged), and which to his own, alternative treatments. A decade later, Starkey (1657) also proposed a controlled trial using a different approach to creating similar treatment comparison groups (Donaldson 2016). And a century later, Anton Mesmer challenged his orthodox physician detractors to cast lots to decide which patients should be treated by them, and which by him, using ‘animal magnetism’:

In order to avoid any later argument and all the questions that could be raised about differences in age, in temperament, in diseases, in their symptoms etc. the assignment of the patients shall be made by the method of lots (Mesmer 1781).

Casting lots is just one of several potentially unbiased methods that can be used to ensure that like will be compared with like in treatment comparisons. Alternation (or rotation) of successive patients to different treatments is an easily understood way of generating patient groups for fair treatment comparisons. As long as the underlying order of the patients’ presentation has not been predetermined in some way that introduces bias, strict alternation ensures that no conscious or unconscious bias results in patients with better or worse prognoses being allocated to one of the treatment comparison groups. Other methods that have been used to ensure that like will be compared with like include patients’ dates of birth, or the terminal digits of their case record numbers.
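The allocation rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reconstruction of any historical protocol: the function names, the patient labels and the even/odd convention for terminal digits are all hypothetical choices made for the example.

```python
from itertools import cycle


def allocate_by_rotation(patients, treatments):
    """Assign successive patients to treatments in strict rotation.

    With two treatments this is alternation; with three or more it is
    rotation. `patients` is assumed to be in order of presentation.
    """
    return list(zip(patients, cycle(treatments)))


def allocate_by_terminal_digit(case_number, even_treatment, odd_treatment):
    """Assign a patient from the last digit of the case record number.

    Assumed convention for this sketch: even digits go to one group,
    odd digits to the other.
    """
    last_digit = case_number % 10
    return even_treatment if last_digit % 2 == 0 else odd_treatment


# Hypothetical patients arriving in order, alternated between A and B.
schedule = allocate_by_rotation(["p1", "p2", "p3", "p4", "p5"], ["A", "B"])
for patient, treatment in schedule:
    print(patient, treatment)
```

Both rules take the decision out of the clinician's hands, which is why they are potentially unbiased; neither hides the next allocation from the person recruiting patients.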

Some accounts of the use of unbiased treatment allocation appear early in the 19th century. In his 1816 Edinburgh doctoral thesis, Alexander Lesassier Hamilton reports having used rotation to allocate sick soldiers to different treatments at a base hospital in Elvas during the Peninsular War (Lesassier Hamilton 1816; Milne and Chalmers 2014). Patients were allocated either to his care; or to the care of a surgeon colleague who, like him, did not use bloodletting; or to a surgeon colleague who did use bleeding.

It had been so arranged, that this number [366] was admitted, alternately, in such a manner that each of us had one third of the whole. The sick were indiscriminately received, and were attended as nearly as possible with the same care and accommodated with the same comforts. One third of the whole were soldiers of the 61st Regiment, the remainder of my own (the 42nd) Regiment. Neither Mr Anderson nor I ever once employed the lancet. He lost two, I four cases; whilst out of the other third [treated with bloodletting by the third surgeon] thirty five patients died (Lesassier Hamilton 1816).
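The figures in this passage can be turned into approximate mortality proportions. The sketch below assumes that 'one third of the whole' means exactly 122 of the 366 patients per surgeon; the quoted passage does not give the group sizes directly, so the percentages are indicative only, and the dictionary labels are ours.

```python
# Figures taken from the quoted passage; exactly equal thirds are an assumption.
total_admitted = 366
per_surgeon = total_admitted // 3  # 122 patients each, if the thirds were exact

deaths = {
    "Anderson (no bloodletting)": 2,
    "Lesassier Hamilton (no bloodletting)": 4,
    "third surgeon (bloodletting)": 35,
}

# Proportion of each surgeon's patients who died.
mortality = {surgeon: died / per_surgeon for surgeon, died in deaths.items()}

for surgeon, proportion in mortality.items():
    print(f"{surgeon}: {deaths[surgeon]}/{per_surgeon} = {proportion:.1%}")
```

Under that assumption, mortality was roughly 2% and 3% in the two groups not bled, against roughly 29% in the group treated with bloodletting.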

In 1835, a Society of Truth-loving Men in Nürnberg reported its remarkable blinded comparison of homeopathic provings with ‘snow water’. Vials containing one or other of the two substances were shuffled prior to distribution for assessment (Löhner 1835; Stolberg 2006). About two decades later, Thomas Graham Balfour, an army surgeon in charge of an orphanage, was explicit about his rationale for using alternate allocation in his assessment of claims that belladonna was protective against scarlet fever. He reported having used alternation to allocate children either to receive belladonna or to a comparison group ‘to avoid the imputation of selection’ (Balfour 1854; Chalmers and Toth 2009).

It seems reasonable to speculate that concern to compare like with like, and so to ‘avoid the imputation of selection’, explains the increasing use of alternate allocation to treatment comparison groups during the late 19th and early 20th centuries (in animals (Pasteur 1881) as well as in humans). Writers in several countries emphasised the need to compare like with like. These included, for example, Jules Gavarret in France (Gavarret 1840; Huth 2006a), Elisha Bartlett in the United States (Bartlett 1844; Huth 2006b), William Guy in Britain (Guy 1860), and Alfred Ephraim in Germany (Ephraim 1890-1894). A quotation from an 1877 Danish doctoral thesis on tracheotomy for diphtheria gives a flavour of the developing thinking about the grounds for causal inferences about the effects of treatments:

If any surgeon with material as large as chief physician Holmer could really take the decision, as a test, to let every second croup patient (with an indication for tracheotomy) remain without the operation and every second undergo the operation, and it turned out that the proportion of unoperated [patients who] recovered was equal to or higher than those operated [on], then one could begin to doubt the value of tracheotomy… (Wanscher 1877). [emphasis added]

The James Lind Library currently contains well over 200 reports of the use of such potentially unbiased methods of prospective allocation in treatment comparisons published before 1948, when the Medical Research Council’s trial of streptomycin was published (MRC 1948). The earlier reports we have identified are listed here.

During the early decades of the 20th century, alternate allocation became increasingly common as a feature of research design, and was designated formally using specific terms in several languages. In 1902, in an article published in Münchener Medizinische Wochenschrift referring to alternate allocation trials on treatments for plague in India, Dr G Polverini of the Institute of Experimental Pathology in Florence deemed ‘die alternative Methode’ the most appropriate ‘for assessing the healing power of a serum in humans’ (Polverini 1903). Six years later, one of the physicians responsible for the trials in India – Nasserwanji Hormusji Choksy – referred to the method they had been using as ‘the alternate case method’ and ‘rational alternation’ (Choksy 1908). In France at about the same time, Maurice Cousin (1905) and his thesis supervisor Arnold Netter (1906) referred to their use of ‘la méthode alternante’ in studies to assess ways of reducing serum sickness. In the United States, Jesse Bullowa (1928) and Russell Cecil and Norman Plummer (1930) referred to ‘alternation’ and to ‘the alternate case method’, respectively, in connection with their trials to assess the effects of serum treatment in pneumonia. And in Austria, Julius Wagner-Jauregg decided to ‘baptise’ the method ‘Simultanmethode’ in German after applying it in studies using fever to treat syphilis (Wagner-Jauregg 1931).

It is worth noting that this designation of alternation as a methodological principle by clinician researchers antedated Ronald Fisher’s promotion of the theoretical statistical qualities of random allocation in The Design of Experiments (Fisher 1935). Indeed, although there are examples of random allocation being used during the 1930s and early 1940s (see, for example, Doull et al. 1931; Theobald 1937; Bell 1941), use of the word ‘random’ to describe treatment allocation sometimes actually referred to alternation (Armitage 2002), even in the writings of Austin Bradford Hill, the statistician most closely associated with the adoption of randomization in Britain (Hill 1937; Chalmers 2005; 2010).

Where was alternate allocation used, in whom, and to test which interventions?

Pre-1948 alternate allocation trials were done across the world. To date, we have found examples in Algeria, Austria, Australia, Britain, Denmark, Egypt, Finland, France, Germany, India, Italy, Malaya, Netherlands, Sudan, the United States, and Vietnam. Among these, a few programmes of alternate allocation trials stand out. Those done in India by Waldemar Haffkine and Nasserwanji Hormusji Choksy at the turn of the century on vaccines and treatments for plague and cholera are early examples of separate studies done within a series of planned controlled trials (Ramanna 2011; Syed and Swaminathan 2011; Chakrabarti 2011; Davey-Smith 2011). In the United States (and in New York and Boston in particular), Jesse Bullowa, William Park, Russell Cecil, Max Finland and others were responsible for a remarkable series of trials testing serum treatment for pneumonia during the third and fourth decades of the 20th century (Podolsky 2008). The only example of anything comparable in Britain appears to have been a cluster of trials done by Thomas Anderson and his colleagues at Ruchill Hospital in Glasgow in the late 1930s, to assess the effects of sulphonamides in a variety of infections (Bryder 2010).

Unsurprisingly, given the overwhelming importance of infectious diseases at the time, many alternate allocation trials were done to assess the effects of interventions to prevent or treat infections. The target infections included bacillary dysentery, cerebrospinal fever, cholera, the common cold, diphtheria, erysipelas, gonorrhœa, impetigo, infant diarrhoea, infectious hepatitis, influenza, malaria, mastitis, measles, meningococcal meningitis, plague, pneumonia, poliomyelitis, puerperal fever, scarlet fever, syphilis, tonsillitis, trichomoniasis, Tsutsugamushi disease, tuberculosis, typhoid fever, typhus, and whooping-cough. The interventions tested included antibiotics, antiseptics, diet, Eucalyptus oil, gamma globulin, physical therapies, proteins and amino acids, specific sera, sulphonamides and other drugs, ‘therapeutic malaria’, vaccines, and vitamins.

Alternate allocation trials were also used to assess the effects of nutritional and other interventions to promote health and growth: unpolished and polished rice for beri-beri; germinated beans compared with lemon juice for scurvy; vitamin B1 for polyneuritis in alcohol addicts; and vitamins, minerals, milk and ultraviolet light to promote child growth and development. In pregnancy and childbirth, alternate allocation was used in studies to assess the effects of micronutrients to prevent anaemia and toxaemia; salt for leg cramps; analgesics for pain in labour; perineal shaving and post partum care of the perineum; ergot alkaloids to reduce postpartum haemorrhage; treatments for acute mastitis and deficient lactation and for preventing sore nipples; and the effects of knee-chest position and postural exercises on postpartum uterine retroversion.

‘The alternate case method’ was also used to challenge claims that surgery was an effective treatment for psychosis, and to put some ‘old wives’ treatments’ to the test: a Dr Middleton in Edinburgh reported that he had alternated tannic acid with ‘strong tea of the lumberjack variety’ (Middleton 1936) for treating scalds in children, with results suggesting that the preferences of ‘old wives’ were as likely to be valid as those of medical experts.

More research is needed to increase understanding of the reasons for the explosion of alternate allocation studies from the 1890s onwards. One explanation may have been the gradual adoption of probabilistic, statistical thinking by some physicians (see, for example, Gavarret 1840; Bartlett 1844; Heiberg 1897; and Ephraim 1890-1894). However, even Almroth Wright, who made a career out of dismissing the application of statistics to medicine in the early part of the 20th century, had started doing alternate allocation studies by the early 1910s (Wright et al. 1914).

What is clear is that, at least as early as the second decade of the 20th century, there were some very clear accounts of the principles that need to be observed when testing treatments. For example, in a paper entitled The crucial test of therapeutic evidence, which was based on an address given at the 1917 annual meeting of the American Medical Association, Torald Sollmann alluded to the unacceptability of biased under-reporting of commercial tests of drugs, and called for independent evaluations, using alternation to control allocation bias and blinding to reduce observer bias (Sollmann 1917). A study published by Adolf Bingel the following year provides a nice example of these two principles being applied in practice (Bingel 1918; Tröhler 2010b; Opinel et al. 2011).

The gradual move from alternation to random allocation

It is clear that, contrary to a common assumption (Chalmers 2010), randomized trials did not suddenly fill a methodological vacuum beginning in 1948. Long before the concept of random allocation was introduced by statisticians, some doctors who wanted to compare preventive and therapeutic strategies recognised that comparison groups generated by alternate allocation would yield more credible evidence than comparison groups based on clinical decisions. There is some evidence of statistical expertise being brought to bear in a few of these early trials. For example, in 1912, a formal statistical test was applied to data from one of Choksy’s many plague studies (Advisory Committee 1912). And during the 1920s, Louis Dublin, an actuary at the Metropolitan Life Insurance Company, seems likely to have been influential in the design and analysis of a series of methodologically sophisticated alternate allocation studies done to evaluate the effects of serum therapy for pneumonia (Podolsky 2006; 2008).

So what led to the gradual move away from alternation to random allocation? The principal disadvantage of alternate allocation is that it usually means that those making decisions about who will participate in treatment comparisons have foreknowledge of upcoming allocations, and this sometimes leads them to undermine an allocation schedule that, in principle, should be unbiased.

In 1933, when assessing the reasons for baseline imbalances in a Medical Research Council trial of serum treatment for pneumonia (MRC 1934), Austin Bradford Hill learned how alternation could be subverted by those recruiting patients (Hill 1933). A dozen years later, Bradford Hill was one of the three-man team designing the MRC’s randomized trial of streptomycin. One of the others was Philip D’Arcy Hart. In a trial that D’Arcy Hart had designed for the Medical Research Council in 1943, allocation had been by rotation to one of four groups – two antibiotic, and two placebo – with the specific purpose of preventing foreknowledge of treatment allocations (MRC 1944; Chalmers and Clarke 2004). Although one of the reasons that the streptomycin trial has become iconic is that the treatment allocation schedule was based on random number tables (MRC 1948), this was not for any esoteric statistical reason (Doll 2002). It was because successful concealment of allocation schedules and prevention of foreknowledge of upcoming allocations among clinicians entering patients in trials is more likely to be achieved with allocation schedules based on random numbers than with schedules using alternation (Chalmers 2005; 2010).
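The difference in foreknowledge can be illustrated with a small simulation. The sketch below is hypothetical throughout: the labels 'S' and 'C', the function names and the use of Python's pseudo-random generator in place of random number tables are assumptions for illustration, not a description of the MRC's procedure. It measures how often a recruiter who has seen the previous allocation can guess the next one by assuming it will be the other treatment.

```python
import random


def alternation_schedule(n, treatments=("S", "C")):
    """Strict alternation: patient i gets treatments[i mod 2]."""
    return [treatments[i % len(treatments)] for i in range(n)]


def random_schedule(n, treatments=("S", "C"), seed=None):
    """Simple random allocation: each patient is assigned independently."""
    rng = random.Random(seed)
    return [rng.choice(treatments) for _ in range(n)]


def guess_rate(schedule):
    """Fraction of allocations guessed correctly by a recruiter who
    always predicts 'the treatment the previous patient did not get'."""
    pairs = list(zip(schedule, schedule[1:]))
    hits = sum(1 for previous, upcoming in pairs if upcoming != previous)
    return hits / len(pairs)


print("alternation:", guess_rate(alternation_schedule(1000)))
print("random:", round(guess_rate(random_schedule(1000, seed=1)), 2))
```

With alternation the guess is always right; with a random schedule it is right only about half the time. A random schedule protects against foreknowledge only if it is also kept concealed from those recruiting patients, which is a separate step that this sketch does not model.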

The need to fill gaps in the history of controlled trials

Over most of the past two decades, our identification of pre-1948 reports of controlled trials using potentially unbiased treatment allocation schedules has been ‘opportunistic’. More recently, we have been able to use full text digital searches of the British Medical Journal, the Lancet, the Journal of the American Medical Association, the New England Journal of Medicine and the Proceedings of the Royal Society of Medicine, from the inceptions of the journals to 1947. In addition, a hand search of the Indian Medical Gazette from 1890 to 1910 was prompted by some of the important information about trials done in India at the turn of the 20th century. The Table below provides a summary of our findings as they stand currently.

Pre-1948 reports of controlled trials using potentially unbiased treatment allocation schedules

Journal      Pre-1900   1900-1929   1930-1939   1940-1947   Total

Total              26          55          82          77     240

BMJ                 5           8          23          21      57
JAMA                2          18          13          16      49
Lancet              2           8          11          21      42
NEJM                1           0           6           0       7
Proc RSM            1           3           3           1       8
Elsewhere          15          18          26          18      77

The methods we have used to identify pre-1948 reports of controlled trials using potentially unbiased treatment allocation schedules are adequate to illustrate the use of this important element of trial design before the widespread adoption of randomization from the late 1940s onwards. However, the numbers in the Table are certainly minimum estimates of numerators, and they lack denominators to allow some estimate of the proportion of all articles on treatment evaluation which have had this feature of trial design. We invite readers to draw our attention to any other pre-1948 reports of trials using potentially unbiased treatment allocation schedules which are not currently included here.

Medical historians have not given adequate attention to the use of unbiased treatment allocation before random allocation began to be adopted more widely from the middle of the 20th century onwards. Some relevant material exists in doctoral theses of which we are aware, but most of this relates to developments in Britain (www.jameslindlibrary.org/article-types/theses/). As is clear from the illustrative material we have assembled, developments were occurring concurrently in a number of countries, and being reported in a number of different languages. To avoid being parochial, research into this important era in the evolution of clinical trials requires knowledge in several languages, and international collaboration (Opinel et al. 2011).

We have provided some tantalizing examples of relevant material published in Danish, French and German. Research funders and researchers in the countries where these languages are used need to recognise how important it is that they contribute to the investigation of an era of fundamental importance in the international development of fair tests of treatments. We hope that our findings will prompt interest in and support for research to document and understand the efforts made to develop reliable tests of treatments in a number of countries during the first half of the 20th century.

Acknowledgements

We dedicate this article to the memory of Harry Marks, a generous adviser to the James Lind Library, and a leading and inspiring historian of the development of the randomized clinical trial, who died in 2011. We thank Ulrich Tröhler and Christian Gluud for translating material published in French, German and Danish; Rosie Wild and Jane Ferrie for independent hand searches of the Indian Medical Gazette; Patricia Atkinson, Rebecca Brice, and Olivia Clarke for clerical help; and Doug Altman, Mike Clarke, Christian Gluud, Iain Milne and Ulrich Tröhler for helpful comments on earlier drafts.

This James Lind Library commentary has been republished in the Journal of the Royal Society of Medicine 2011;105:221-227.

References

Advisory Committee on Plague Investigations in India (1912). The serum treatment of human plague. Journal of Hygiene. Plague Supplement II, LVI:326-39.

al-Razi (10th century CE; 4th Century AH). Kitab al-Hawi fi al-tibb. [The comprehensive book of medicine]

Armitage P (2002). Randomisation and alternation: a note on Diehl et al. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Balfour TG (1854). Quoted in West C. Lectures on the Diseases of Infancy and Childhood. London, Longman, Brown, Green and Longmans, 1854, p 600.

Bartlett E (1844). An essay on the philosophy of medical science. Philadelphia: Lea and Blanchard.

Bell JG (1941). Pertussis prophylaxis with two doses of alum-precipitated vaccine. Public Health Reports 56:1535-1546.

Bingel A (1918). Über Behandlung der Diphtherie mit gewöhnlichem Pferdeserum. Deutsches Archiv für Klinische Medizin 125:284-332.

Bryder L (2010). The Medical Research Council and clinical trial methodologies before the 1940s: the failure to develop a ‘scientific’ approach. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Bullowa JGM (1928). The use of antipneumococcic refined serum in lobar pneumonia: data necessary for a comparison between cases treated with serum and cases not so treated, and the importance of a significant control series of cases. JAMA 90:1354-1358.

Cecil RL, Plummer N (1930). Pneumococcus Type I pneumonia – a study of eleven hundred and sixty-one cases, with especial reference to specific therapy. JAMA 95:1547-1553.

Chakrabarti P (2011). Commentary: An experimental theatre for vaccines: Bombay in the time of plague. Int J Epidemiol.

Chalmers I (2005). Statistical theory was not the reason that randomisation was used in the British Medical Research Council’s clinical trial of streptomycin for pulmonary tuberculosis. In: Jorland G, Opinel A, Weisz G, eds. Body counts: medical quantification in historical and sociological perspectives. Montreal: McGill-Queens University Press, p 309-334.

Chalmers I (2010). Why the 1948 MRC trial of streptomycin used treatment allocation based on random numbers. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Chalmers I, Clarke M (2004). The 1944 Patulin Trial: the first properly controlled multicentre trial conducted under the aegis of the British Medical Research Council. International Journal of Epidemiology 32:253-260.

Chalmers I, Toth B (2009). 19th century controlled trials to test whether belladonna prevents scarlet fever. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Choksy KBNH (1908). On recent progress in serum-therapy of plague. BMJ 1:1282-1284.

Cousin M (1905). Des éruptions consécutives aux injections de sérum antidiphthérique et de leur traitement prophylactique par l’ingestion de chlorure de calcium. Thèse pour le Doctorat en médecine. Paris: Jules Rousset, p 36-44.

Cox DR (2009). Randomization for concealment. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Davey-Smith G (2011). Int J Epidemiol.

Doll R (2002). The role of data monitoring committees. In: Duley L, Farrell B, eds. Clinical Trials. London: BMJ Books, p 97-104.

Doull JA, Hardy M, Clark JH, Herman NB (1931). The effect of irradiation with ultra-violet light on the frequency of attacks of upper respiratory disease (common colds). American Journal of Hygiene 13:460-477.

Ephraim A (1890-1894). Über die Bedeutung der statistischen Methode für die Medicin. [On the relevance of the statistical method for medicine] Volkmann’s Sammlung Klinische Vortraege N.F. Innere Medicin 24:706-716. Leipzig: Breitkopf and Härtel.

Fisher RA (1935). The design of experiments. Edinburgh: Oliver and Boyd.

Gavarret LDJ (1840). Principes généraux de statistique médicale: ou développement des règles qui doivent présider à son emploi. Paris: Bechet jeune & Labé.

Guy WA (1860). Croonian Lectures on the numerical method, and its application to the science and art of medicine. BMJ 2:553-555.

Heiberg P (1897). Studier over den statistiske undersøgelsesmetode som hjælpemiddel ved terapeutiske undersøgelser [Studies on the statistical study design as an aid in therapeutic trials]. Bibliotek for Læger 89:1-40.

Hill AB (1933). Serum treatment of pneumonia, 22 December 1933, cited in Austoker J, Bryder L, eds. Historical perspectives on the role of the MRC. Oxford: Oxford University Press 1989, p 46-47.

Hill AB (1937). Principles of medical statistics. London: Lancet.

Huth EJ (2006a). Jules Gavarret’s Principes Généraux de Statistique Médicale: a pioneering text on the statistical analysis of the results of treatments. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Huth EJ (2006b). Transatlantic ideas on the philosophy of therapeutics in the middle of the 19th century. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Kaptchuk TJ (2011). A brief history of the evolution of methods to control of observer biases in tests of treatments. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Lesassier Hamilton A (1816). Dissertatio Medica Inauguralis De Synocho Castrensi (Inaugural medical dissertation on camp fever). Edinburgh: J Ballantyne.

Lind J (1753). A treatise of the scurvy. In three parts. Containing an inquiry into the nature, causes and cure, of that disease. Together with a critical and chronological view of what has been published on the subject. Edinburgh: Printed by Sands, Murray and Cochran for A Kincaid and A Donaldson.

Löhner G, on behalf of a society of truth-loving men (1835). Die homöopathischen Kochsalzversuche zu Nürnberg [The homeopathic salt trials in Nuremberg]. Nuremberg.

Massey I (1723). A short and plain account of inoculation. With some remarks on the main argument made use of to recommend that practice, by Mr. Maitland and others. To which is added, a letter to the learned James Jurin, M.D.R.S. Secr. Col. Reg. Med. Lond. Soc. London: W. Meadows.

Medical Research Council Therapeutic Trials Committee (1934). The serum treatment of lobar pneumonia. BMJ 1:241-245.

Medical Research Council (1944). Clinical trial of patulin in the common cold. Lancet 2:373-375.

Medical Research Council (1948). Streptomycin treatment of pulmonary tuberculosis. BMJ 2:769-782.

Mesmer FA (1781). Précis historique des faits relatifs au magnétisme animal jusques en avril 1781. Par M. Mesmer, Docteur en Médecine de la Faculté de Vienne. Ouvrage traduit de l’Allemand. [Historical account of facts relating to animal magnetism up to April 1781. By M. Mesmer, Doctor in Medicine of the Vienna Faculty. Work translated from German] A Londres [false imprint, probably Paris.] p. 111-114; 182.

Middleton DS (1936). Tea for burns or scalds in the home. BMJ 1:555.

Milne I, Chalmers I (2014). Alexander Lesassier Hamilton’s 1816 report of a controlled trial of bloodletting. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Netter A (1906). Efficacité de l’ingestion de chlorure de calcium comme moyen préventif des éruptions consécutives aux injections de sérum. Séances et Mémoires de la Société de Biologie 58:279-280.

Opinel A, Tröhler U, Gluud C, Gachelin G, Davey Smith G, Podolsky SH, Chalmers I (2011). The evolution of methods to assess the effects of treatments, illustrated by the development of treatments for diphtheria, 1825-1918. Int J Epidemiol. doi:10.1093/ije/dyr162.

Pasteur L (1881). Compte rendu sommaire des expériences faites à Pouilly-le-Fort, près Melun, sur la vaccination charbonneuse. Comptes rendus de l’Académie des Sciences 92:1378-1383.

Petrarch F (1364). Rerum Senilium Libri. Liber XIV, Epistola 1. Letter to Boccaccio (V.3). [Letters of old age].

Podolsky SH (2006). Pneumonia before antibiotics: therapeutic evolution and evaluation in twentieth-century America. Baltimore: Johns Hopkins University Press.

Podolsky SH (2008). Jesse Bullowa, specific treatment for pneumonia, and the development of the controlled clinical trial. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Polverini G (1903). Serumtherapie gegen Beulenpest [Serum treatment of bubonic plague]. Münchener Medizinische Wochenschrift 50:649-651.

Ramanna M (2011). Commentary: NH Choksy and serum therapy. Int J Epidemiol.

Silverman WA, Chalmers I (2002). Casting and drawing lots: a time-honoured way of dealing with uncertainty and for ensuring fairness. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Sollmann T (1917). The crucial test of therapeutic evidence. JAMA 69:198-199.

Starkey G (1657). Natures explication and Helmont’s vindication…Or A short and sure way to a long and sound life. London: printed by E. Cotes for Thomas Alsop at the two Sugar-loaves over against St. Antholins Church at the lower end of Watling Street.

Stolberg M (2006). Inventing the randomized double-blind trial: The Nuremberg salt test of 1835. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Syed IS, Swaminathan KL (2011). Commentary: Dr. Choksy’s dilemma. Int J Epidemiol. doi:10.1093/ije/dyr168.

Theobald GW (1937). Effect of calcium and vitamin A and D on incidence of pregnancy toxaemia. Lancet 2:1397-1399.

Tibi S (2005). Al-Razi and Islamic medicine in the 9th Century. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Tröhler U (2010a). The introduction of numerical methods to assess the effects of medical interventions during the 18th century: a brief history. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Tröhler U (2010b). Adolf Bingel’s blinded, controlled comparison of different anti-diphtheritic sera in 1918. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Van Helmont JB (1648). Ortus medicinæ: Id est Initia physicæ inaudita. Progressus medicinae novus, in morborum ultionem, ad vitam longam. [The dawn of medicine: That is, the beginning of a new Physic. A new advance in medicine, a victory over disease, to [promote] a long life]. Amsterodami: Apud Ludovicum Elzevirium, p 526-527.

Wagner-Jauregg J (1931). Ueber die Infektionsbehandlung der progressiven Paralyse [On infection treatment of progressive paralysis]. Münchener Medizinische Wochenschrift 78:4-7.

Wanscher O (1877). Om Diphteritis og Croup – særligt med hensyn til Tracheostomien ved samme. [On diphtheria and croup – especially regarding tracheostomy in this condition]. Disputats [Thesis]. Jacob Lund: Kjøbenhavn, p 67-68.

Wright AE, Morgan WP, Colebrook L, Dodgson RW (1914). Observations on prophylactic inoculation against pneumococcus infections, and on the results which have been achieved by it. Lancet 1:87-95.