Matthews RAJ (2025). The problematic history of randomised controlled trials Part 1: presumption and confusion on the road to randomisation.

© Robert A J Matthews, Department of Mathematics, Aston University, Birmingham B4 7ET UK. Email: rajm@physics.org


Cite as: Matthews RAJ (2025). The problematic history of randomised controlled trials Part 1: presumption and confusion on the road to randomisation. JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/the-problematic-history-of-randomised-controlled-trials-part-1-presumption-and-confusion-on-the-road-to-randomisation/)


Randomisation is one of many statistical ideas that have permeated medical research but are imperfectly understood (Altman 1991)

Introduction

The emergence of the randomised controlled trial (RCT) ranks among the most important developments of 20th century medicine. Often described as the “gold standard” for assessing therapies, its exalted status has its origins in the UK Medical Research Council (MRC) clinical trial of the efficacy of streptomycin for treating pulmonary tuberculosis (Medical Research Council 1948; Bynum 2008). Published in 1948, the trial featured a method of randomly allocating patients to the treatment and control arms designed by the statistician Austin Bradford Hill of the London School of Hygiene and Tropical Medicine (LSHTM). The aim was to achieve a fair comparison of the two groups free of biased allocation, unconscious or otherwise. While there had been sporadic use of random allocation previously (eg Chalmers 2005), the MRC streptomycin trial is generally regarded as marking the emergence of the modern clinical trial. It also heralded the decline of the long-standing practice of alternation, in which successive patients are assigned to each trial arm in strict alternating order.

While seemingly simple and straightforward, this change in practice raises several questions. Perhaps the most obvious is why alternation had become the de facto method of allocation in the first place. Given that random processes such as the casting of lots have been used to achieve fair distributions for thousands of years (Silverman & Chalmers 2001), randomness would seem the natural choice for allocating patients when the concept of fair clinical trials began to flourish in the 19th century. The standard narrative of the ascendancy of RCTs (eg Booth 1993; Matthews 1995) gives no clear explanation for this. Nor does it explain why alternation was still being defended long after the MRC trial, including by Hill himself.

Similar questions surround the ultimate ascendancy of random allocation. Until recently, the standard narrative ascribed this to Hill’s embrace of the work of the statistician Ronald Fisher (eg Booth 1993; Matthews 1995; Stigler 2016), who developed the theory of randomisation – which underpins the statistical consequences of random allocation – while working at the Rothamsted agricultural research institute in the 1920s. However, evidence from a variety of sources, including Hill himself, shows that Fisher’s theory played no substantive role in his advocacy of random allocation (Yoshioka 1998; Chalmers 1999; Chalmers 2013; Neuhauser et al 2020). Hill’s real motivation is now held to have been entirely pragmatic: countering the risk of clinicians biasing the allocation of patients to the arms of a trial, inadvertently or otherwise. The systematic nature of alternation makes it vulnerable to such bias, an issue Hill himself investigated in the early 1930s (Chalmers 1999; Chalmers 2013). Importantly, he also recognised that the unpredictability of a random allocation sequence was necessary but not sufficient for preventing allocation bias, leading him to insist the random sequence also be concealed.
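Why concealment matters can be shown with a small simulation – a minimal sketch, in which the enrolment model, numbers and function names are illustrative assumptions rather than anything drawn from the historical record:

```python
import random
import statistics

# Minimal sketch: even a genuinely random allocation sequence permits
# allocation bias if the next assignment is visible to the enroller.
# Here an enroller who can see an upcoming treatment slot steers the
# milder of two available patients into it; under concealment the
# choice cannot depend on the slot.

def baseline_severity_gap(concealed, n=10_000):
    """Mean baseline severity of treated minus control patients."""
    treated, control = [], []
    for _ in range(n):
        slot_is_treatment = random.random() < 0.5   # fair random allocation
        if concealed or not slot_is_treatment:
            severity = random.gauss(0, 1)           # next patient, as they come
        else:
            # Unconcealed: pick the milder of two available patients
            severity = min(random.gauss(0, 1), random.gauss(0, 1))
        (treated if slot_is_treatment else control).append(severity)
    return statistics.mean(treated) - statistics.mean(control)

print(f"unconcealed: {baseline_severity_gap(concealed=False):+.3f}")  # ~ -0.56
print(f"concealed:   {baseline_severity_gap(concealed=True):+.3f}")   # ~  0.00
```

Although every assignment in this sketch is random, foreknowledge alone suffices to unbalance the arms at baseline; concealment removes the cue, which is precisely the distinction Hill insisted upon.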

This still leaves unanswered the question of why Fisher’s seminal work had so little influence on Hill’s advocacy of random allocation. By the time the MRC streptomycin trial was underway, Fisher’s landmark texts Statistical Methods for Research Workers (1925) and The Design of Experiments (1935) were in their 10th and 4th editions respectively, and their influence had spread beyond their origins in crop experiments to many other areas, including clinical research. Hill was certainly aware of Fisher’s concept of randomisation and its implications for both the reliability and validity of inferences from experimental studies. Yet he remained convinced the value of randomisation in clinical trials lay in its ability to prevent allocation bias (Armitage 2003; Chalmers 2005; Chalmers 2011) rather than “any esoteric statistical reason” (Doll 2002).

This three-part study examines the standard narrative of the ascendancy of the RCT with the aim of resolving these questions. Taken together, the three parts indicate that the current narrative underplays the role of Hill’s statistical education in his approach to inferential issues in general and Fisher’s concept of randomisation in particular. This led him to underestimate the relevance and potency of randomisation in clinical research. Moreover, Hill’s views were reflected in his hugely influential text Principles of Medical Statistics, first published in 1937 (Hill 1937) and still in print over half a century later. It is argued that the consequences can be seen in attitudes towards randomisation and its value in the design and analysis of clinical trials to this day.

The curiously convoluted road to random allocation

The allocation of patients is part of one of the oldest narratives in the history of medicine, centred on the quest for reliable assessment of therapies (Chalmers et al 2012). References to random processes being used to make unbiased selections date back to 9th century BCE Mesopotamia, where new rulers were identified by the casting of lots (Oppenheim 1977 pp 99-100). This process is mentioned many times in the Bible, with the Talmud even including methods for preventing cheating (Franklin 2001). An apparent proposal for the use of random allocation to resolve a clinical question appears in the writings of the 17th century Flemish physician Jan Baptista van Helmont, who suggests drawing lots to form two groups of patients for a comparison of the effectiveness of bloodletting and purging (Silverman & Chalmers 2001). Exactly how or why the allocation was to be performed is disputed; the comparison may never have been carried out in any case (Donaldson 2016).

The oldest report currently known of random allocation in a clinical trial is from 1835, in a study of the reality or otherwise of homeopathic effects (Kaptchuk 1998). Led by a group of sceptical physicians in Nuremberg, Bavaria, it was designed to investigate the detectability rather than the efficacy of such effects. To counter accusations of bias, the vials of homeopathic and neutral solutions were numbered and thoroughly shuffled prior to being distributed to dozens of trial participants, with both the participants and the triallists blinded to the allocation. As such, the trial design is strikingly modern, though as Stolberg (2006) points out, it was still vulnerable to subversion by dishonest reporting of effects by trial participants.

Around this time randomness was being discussed in another medical context: the use of “scientific” methods for assessing the outcome of clinical trials. The idea was to compare the apparent efficacy of treatment with what one would expect if mere chance were the true cause. Such a comparison could then be made quantitative through the use of probability theory. If the observed effect was sufficiently greater than that expected on the basis of chance alone, the trial outcome might be deemed worth taking seriously. This at least was the idea; 19th century critics of the proposal rightly recognised that it was replete with assumptions and caveats, prompting responses from the clinical community ranging from mute incomprehension to vocal outrage (Matthews 1995; Tröhler 2020). Some objected to what they regarded as the subversion of professional judgement by probabilistic reasoning. Others saw the use of mathematics as a form of charlatanism intended to portray medicine as an exact science. There was also concern that “strangers at the bedside” wielding statistical methods would undermine the relationship between the physician and the individual patient (Matthews 1995 ch2).

Such concern may have played a role in the otherwise puzzling ascendancy of alternation rather than random allocation as a means of achieving fair trials. By the mid-1800s, accounts of clinical trials using alternation were emerging whose authors seemed keen to point out that this approach was not only fair in principle but also visibly fair to patients (Chalmers et al 2012). This may have been perceived as better aligned with the traditional duty of care than random allocation, in which patients were allocated like cards drawn from a shuffled deck. It is notable that Hill himself felt it necessary to address such qualms over a century later as random allocation became more widely used (Hill 1963); the issue remains controversial (Nardini 2014).

This focus on the treatment of the individual patient may have blinded proponents of alternation to the fact that while it could result in a fair comparison, this was not guaranteed. Alternation was clearly vulnerable to manipulation by physicians motivated by their perceived duty of care to specific patients. It also offered no protection against unrecognised sources of bias undermining the reliability of the trial outcome. Hill’s great achievement was to show that concealed random allocation provides a simple remedy to the first challenge. Fisher’s great achievement was to show that randomisation can also overcome the second challenge by accommodating the effects even of unknown biases. Simply put, it achieves this by converting them into the same probabilistic form as the play of chance. In the case of a clinical trial, random allocation scatters the biases and confounders across the trial arms with equal probability, while scrambling any correlations lurking in the patient series. Random allocation thus replaces the questionable security offered by alternation with a probability distribution guaranteeing a statistically valid estimate of both the treatment effect and the accompanying uncertainty. As Fisher put it:

“Randomisation properly carried out…relieves the experimenter from the anxiety of considering and estimating the magnitude of the innumerable causes by which his data may be disturbed.” (Fisher 1935 p44)
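The force of Fisher’s remark can be illustrated with a short simulation – a minimal sketch in which the period-2 severity pattern is an invented confounder, not a historical example. When the patient series carries a systematic pattern, alternation locks onto it; random allocation converts it into ordinary chance variation:

```python
import random
import statistics

# Minimal sketch: patients arrive with a baseline severity correlated
# with admission order -- here an invented period-2 pattern, as might
# arise from referral routines. Alternation locks onto the pattern;
# random allocation scrambles it into chance variation.

def severity(i):
    return 1.0 if i % 2 == 0 else 0.0   # hypothetical hidden confounder

def arm_imbalance(allocate, n=40, reps=2000):
    """Mean and spread of the severity difference between arms."""
    diffs = []
    for _ in range(reps):
        treated, control = [], []
        for i in range(n):
            (treated if allocate(i) else control).append(severity(i))
        diffs.append(statistics.mean(treated) - statistics.mean(control))
    return statistics.mean(diffs), statistics.pstdev(diffs)

alternation = lambda i: i % 2 == 0              # every second patient treated
randomised  = lambda i: random.random() < 0.5   # coin-flip allocation

print(arm_imbalance(alternation))  # (1.0, 0.0): fixed, invisible bias
print(arm_imbalance(randomised))   # (~0.0, ~0.16): imbalance is pure chance
```

Under alternation the imbalance is fixed and would be mistaken for a treatment effect; under random allocation it is centred on zero with a known probability distribution, which is exactly what valid statistical inference requires.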

Crucially for the thesis explored here, however, Fisher’s theory also showed that random allocation brought other remarkable benefits, including robust inferences even from very small trials. As such, random allocation offers triallists what Fisher called “The first requirement which governs all well-planned experiments” (Fisher 1925 p 224): the ability to extract reliable insights by statistically valid methods.

Given this, it is remarkable that the standard narrative of the rise of RCTs focuses on the one consequence of random allocation identified by Hill: protection against allocation bias. This is clearly a major advantage, and Hill’s recognition of this is rightly celebrated. Yet it raises the question of why it took so long for triallists to (re-)discover the value of the ancient method of random selection in their quest for reliable assessments of therapies. The answer may lie in a simple, long-standing but unsafe presumption concerning the way in which patients become participants in a clinical trial.

The Quasirandomness Fallacy

By the late 19th century, anecdotal reports and case series were increasingly being supplanted by clinical trials in which patients were assigned to treatment arms using alternation (Opinel et al 2011). The paradigmatic example of this transition is the clinical trial of a treatment for diphtheria published in 1898 by the Danish physician Johannes Fibiger (Fibiger 1898). Studies based on historical controls had led to the adoption of so-called serum therapy, which seemingly cut fatality rates in half. Sceptics remained unconvinced, not least because so dramatic a benefit was not found in studies involving patients with similar ages and symptoms (Gluud 2011). Both Fibiger and Søren Sørensen, his professor at Blegdamshospitalet in Copenhagen, recognised the risk that subjective assessment of patients by clinicians was leading to allocation bias, and thus unreliable trial findings. Fibiger declared that it could “hardly be doubted” that such bias was present in the earlier studies (Hróbjartsson et al 1998), and was equally clear about the solution:

“In many cases a trustworthy verdict can only be reached when a large number of randomly selected patients are treated with the new remedy, and, at the same time, an equally large number of randomly selected patients are treated as usual”. (Fibiger 1898; emphasis added)

This has led to Fibiger’s trial being described as “(T)he first clinical trial in which random allocation was used and emphasised as a pivotal methodological principle.” (Hróbjartsson et al 1998). Given that Fibiger fails to explain why the allocation should be deemed random, such status is questionable; it vanishes with Fibiger’s description of how patient allocation was to be performed:

“The only method which could be used rationally was to treat every second patient with serum [and] every other as usual.” (Fibiger 1898; emphasis added)

Practical considerations led Fibiger to use a form of cluster allocation in which all patients admitted on alternate days received either the serum or standard care. Nevertheless, this is clearly allocation by alternation, not random allocation. How had Fibiger come to confuse a rigidly regular process with its exact opposite? The conflation is so clear it suggests Fibiger did not believe there was any contradiction and instead saw alternation as somehow consistent with the notion of “randomly selected patients”. Given that strict alternation per se is obviously not a randomising process and no mention of one appears in his otherwise detailed report, it seems Fibiger considered patients presenting for potential inclusion as already randomised. An explanation for this may lie in Fibiger’s reference to the work of Povl Heiberg, a Danish physician and author of an 1897 paper on the use of statistical methods in clinical trials (Heiberg 1897; Gluud and Hilden 2009). Heiberg advocated both the use of such methods and of allocation by alternation, despite the former assuming that patients in each arm constitute random samples, which alternation per se cannot justify. Fibiger states that Heiberg regarded alternation as “statistically correct” (Fibiger 1898), which suggests Fibiger used alternation without recognising the presumption of random sampling underlying Heiberg’s statement.

As Hróbjartsson and colleagues note, while Fibiger’s trial shows no obvious sign of allocation bias, his apparent belief that it would be eliminated by alternation was clearly misplaced. This has led to his trial being characterised as “quasirandomised” (Hróbjartsson et al 1998; Gluud and Hilden 2009). While the term eludes definition in this context (Schulz and Grimes 2002), it usefully allows the misconception at the heart of alternation to be characterised as the Quasirandomness Fallacy: the presumption that patient presentation is a “quasirandom” process resulting from happenstance, with no formal randomising process being needed to allow valid and reliable statistical inference based on standard sampling theory. Given that patients become available for a host of reasons, the notion that they are “quasirandomised” may seem acceptable. Genuine randomness is, however, a remarkably subtle phenomenon (Bennett 1998; Knuth 1997), and its role in statistical inference requires characteristics that cannot be assumed to be generated simply via happenstance. In the case of clinical trials, statisticians have long warned against mistaking mere haphazardness for true randomness (Mainland 1948 & 1963; Altman 1991; Altman & Bland 1999) or presuming its presence prior to treatment allocation. As Cox explains:

“It is sometimes argued that, e.g. the allocation of patients to one of a number of possible treatment regimens may be determined by so many ill-specified considerations that observational data may be analyzed as if treatment allocation were essentially random; in particular, it is implied that actual randomization is not needed. While this may be the case in specific instances, such reliance on effective randomization arising from ill-specified procedures is likely to be hazardous”. (Cox 2009 p 416)
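Cox’s warning is easily demonstrated – a minimal sketch, in which the deterministic “weekly routine” generating the haphazard series is an assumption for illustration only. A sequence can look irregular yet fail the most basic requirement of randomness, serial independence:

```python
import random

# Minimal sketch: a "haphazard" series -- here admissions tagged by an
# invented weekly routine -- can look irregular yet fail a basic check
# of randomness, unlike a genuinely random comparator.

def lag1_excess(bits):
    """Fraction of adjacent equal pairs, minus the 0.5 expected
    of an independent fair sequence (approximately 0 if random)."""
    agree = sum(a == b for a, b in zip(bits, bits[1:]))
    return agree / (len(bits) - 1) - 0.5

n = 10_000
haphazard = [1 if i % 7 < 4 else 0 for i in range(n)]   # weekly routine
genuine   = [random.getrandbits(1) for _ in range(n)]   # true coin flips

print(f"haphazard: {lag1_excess(haphazard):+.3f}")  # ~ +0.214: strong structure
print(f"random:    {lag1_excess(genuine):+.3f}")    # ~  0.000
```

Happenstance may produce series that pass casual inspection, but the probabilistic guarantees of sampling theory apply only when the allocation procedure itself injects the randomness.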

The presence of the Quasirandomness Fallacy can be seen in the odd prevalence, noted by previous authors, of self-contradicting phrases such as “patients were randomly allocated by alternation” found in historical trial reports (Armitage 2002; Chalmers 2013; Farewell and Johnson 2017). These are consistent with alternation being seen as akin to dealing cards from an already well-shuffled pack, the outcome being both fair and random but only if no attempt has been made to “rig the deal”. A case in point is the well-known cold vaccine trial published in 1938 by Diehl and colleagues (Diehl et al 1938). Like Fibiger’s trial, this has been cited as a pioneering example of a randomised clinical trial (Lilienfeld 1982) as the trial participants were said to have been “assigned at random and without selection”. However, Diehl later stated without obvious concern that alternation had in fact been used (Waller 1997). This admission robs the trial of its historical significance, but – as with Fibiger’s trial – this becomes a mere technicality if one believes the original patient series was already random and that the allocation of the patients by alternation wasn’t “rigged”. Such conflations can still be found in contemporary accounts of RCTs (Wessely 2007). For the purposes of this study, however, the most important examples appear in the work of the statistician who most famously advocated for random allocation.

Hill’s Principles of Medical Statistics

In 1936, The Lancet asked Hill to write a series of weekly articles on medical statistics in response to “the steadily increasing demand among both clinical and public health workers” for an understanding of statistical methods (Editorial, Lancet 1937). Hill appears to have been chosen because of his experience of both applying and teaching statistical methods at the LSHTM. The resulting series was so popular that it was rapidly republished as what became the hugely influential textbook Principles of Medical Statistics (Hill 1937), whose first edition appeared in June 1937. In Chapter 1 (“The Aim of the Statistical Method”), Hill describes the challenges of designing a “carefully planned experiment”, specifically the need to consider all the possible influences that might affect the outcome other than the difference in treatment. After pointing out that statisticians may be able to advise on a suitable design, Hill goes to the heart of the challenge:

“We can never be certain that we have not overlooked some relevant factor or that some factor is not present that could not be foreseen or identified”. (Hill 1937 p5)

This would seem the ideal opportunity for Hill to introduce his readers to Fisher’s concept of randomisation, which resolves precisely this challenge. Yet Hill embarks instead on a defence of alternation, which he argues is “often satisfactory” because

“…no conscious or unconscious bias can enter in, as it may in any selection of cases, and because in the long run we can fairly rely upon this random allotment of the patients to equalise in the two groups the distribution of other characteristics that may be important”. (Hill 1937 p5)

This statement is remarkable for several reasons. Most obviously, Hill has conflated alternation with “random allotment”. This is surprising given that on the previous page Hill had cited Fisher’s The Design of Experiments, which clearly differentiates between the two, warns of the deficiencies of methods like alternation and includes the celebrated remark (see above) about the ability of randomisation to resolve the very challenge Hill describes. Yet Hill makes no mention of any of this, his reference to Fisher’s textbook being to point out that “Elaborate experiments can be planned in which quite a number of factors can be taken into account statistically at the same time”. He then states that he will not discuss further “these more difficult methods” – now known as factorial designs. Nor, however, does Hill go on to explain the importance and benefits of random allocation even in simple clinical trial designs.

Most perplexing of all is Hill’s claim that alternation is invulnerable to “conscious or unconscious bias”, given that he already knew this was untrue. In 1933, he had been asked by the MRC’s Therapeutic Trials Committee to review a draft report of a multi-centre trial of a serum treatment for lobar pneumonia, in which alternation had been specified. His involvement followed evidence that triallists had deliberately allocated older patients to the control groups, to avoid potentially wasting the serum (Edwards 2004). Hill’s report, dated 22 December 1933, was deemed so damning that the Secretary of the MRC barred its release even to the triallists (Edwards 2004). It has since been lost, but examination of its contents before its disappearance indicates that Hill argued that greater effort should have been taken to ensure “that the division of cases really did ensure a random selection.” (Bryder 2011). If so, this is yet another conflation of random allocation with alternation. More importantly, it shows that – contrary to his statement in Principles four years later – Hill knew the latter was vulnerable to subversion with potentially disastrous consequences. Oddly, this did not deter him from using this same serum trial in Principles to illustrate how “random allocation” works in practice. Hill goes on to conclude the first chapter of Principles with the clearest example of conflation of all:

“If the series of cases is large, a random allocation of individuals – eg by cases being alternately placed in the treated and untreated groups – may reasonably be relied on to equalise the two groups in all characteristics except the one under examination” (Hill 1937 p8; emphasis added).

It has been argued that the pneumonia serum trial debacle played a crucial role in convincing Hill of the need for concealed random allocation (Chalmers 1999; Chalmers 2005; Chalmers 2013). If so, it was a Damascene conversion he chose not to share with his readers for many years. One could argue Hill was simply abiding by the ban on discussing his confidential report on the trial; even so, this does not explain why he failed to give even a fictionalised example of the vulnerability of alternation to manipulation. Not until the sixth edition of Principles, published in 1955 (Hill 1955) – almost a decade after planning the streptomycin RCT and over two decades after his damning serum trial report – does Hill’s advocacy of alternation disappear from the text. Instead, the 1955 edition includes an additional chapter devoted to “the special problems of clinical trials”, with concealed random allocation put forward as a means of countering clinician-induced allocation bias. Even then, however, Hill insisted that alternation had been successfully used in many trials, its only failing being that it “may, however, be insufficiently random” because of interventions by triallists.

Hill remained strangely ambivalent about the advantages of random allocation for the rest of his life. Several explanations for this have been offered, including by Hill himself; their plausibility and implications are discussed in the second part of this study.

The demise of alternation

The standard narrative of the rise of RCTs typically portrays the MRC’s 1948 trial as a turning point in the design of clinical trials. Yet its impact was less than decisive: despite its many inadequacies, alternation continued to be used in a significant proportion of controlled trials long after the MRC streptomycin trial. A study of allocation methods used in perinatal trials in 57 different countries from 1950 to the mid-1980s found that “quasirandom” methods such as alternation remained the principal approach until the early 1960s and still accounted for nearly a quarter of trials by the late 1970s (Chalmers et al 1986). These proportions are probably lower bounds, given the persistent confusion over what constitutes genuinely random allocation (Chalmers 2024, personal communication).

Alternation may no longer hold centre-stage in the design of clinical trials, but the full implications of Fisher’s theory of randomisation continue to be overlooked and misunderstood. In the second part of this study, Hill’s celebrated advocacy of random allocation is examined in the context of his own understanding of Fisher’s theory, a factor that has been largely ignored by the standard narrative.

Declarations

Competing Interests: None declared.

Funding: None declared.

Ethics approval: Not applicable.

Guarantor: RAJM.

Contributorship: Sole authorship.

Provenance: Invited article from the James Lind Library.

Acknowledgements: I thank Iain Chalmers for encouraging me to re-examine the standard narrative of the emergence of RCTs. His detailed feedback on early drafts, along with that from Vern Farewell, Stephen Senn, Scott Podolsky, Fabio Molo and Denise Best, is gratefully acknowledged.

For more information on allocation bias, see this entry in the Catalogue of Bias: Catalogue of Bias Collaboration. Spencer EA, Heneghan C, Nunan D. Allocation bias. In: Catalogue of Bias 2017.

Link to the second part of this brief history – Matthews RAJ (2025). The problematic history of randomised controlled trials Part 2: Hill’s “pragmatic” view of randomisation and its origins

References

Altman DG (1991). Randomisation. BMJ 302: 1481-1482.

Altman DG, Bland JM (1999). Statistics notes: Treatment allocation in controlled trials: why randomise? BMJ 318: 1209.

Armitage P (2002). Randomisation and alternation: a note on Diehl et al. JLL Bulletin: Commentaries on the history of treatment evaluation. Available from www.jameslindlibrary.org/articles/randomisation-and-alternation-a-note-on-diehl-et-al/ (accessed on 17 December 2025).

Armitage P (2003). Fisher, Bradford Hill, and randomization. International Journal of Epidemiology 32: 925–928.

Bennett DJ (1998). Randomness. Cambridge: Harvard University Press, chapter 9.

Booth CC (1993). Clinical research. In: Bynum WF and Porter R (eds) Companion encyclopaedia of the history of medicine. Vol 1. London: Routledge, p.223.

Bryder L (2011). The Medical Research Council and clinical trial methodologies before the 1940s: the failure to develop a ‘scientific’ approach. Journal of the Royal Society of Medicine 104: 335-343.

Bynum WF (2008). The history of medicine: a very short introduction. Oxford: Oxford University Press, p.150.

Chalmers I (1999). Why transition from alternation to randomisation in clinical trials was made. BMJ 319: 1372.

Chalmers I (2005). Statistical theory was not the reason that randomization was used in the British Medical Research Council’s Clinical Trial of streptomycin for pulmonary tuberculosis. In: Jorland G, Weisz G, Opinel A (editors) Body counts: medical quantification in historical and sociological perspective. Montreal: McGill-Queen’s Press, pp 309-334.

Chalmers I (2011). Why the 1948 MRC trial of streptomycin used treatment allocation based on random numbers. Journal of the Royal Society of Medicine 104: 383–386.

Chalmers I (2013). UK Medical Research Council and multicentre clinical trials: from a damning report to international recognition. Journal of the Royal Society of Medicine 106: 498–509.

Chalmers I (2024). Personal communication, Oxford, 9 February 2024.

Chalmers I, Dukan E, Podolsky S, Davey Smith G (2012). The advent of fair treatment allocation schedules in clinical trials during the 19th and early 20th centuries. Journal of the Royal Society of Medicine 105: 221–227.

Chalmers I, Hetherington J, Newdick M, Mutch L, Grant A, Enkin M, et al (1986). The Oxford database of perinatal trials: developing a register of published reports of controlled trials. Controlled Clinical Trials 7: 306-324.

Cox DR (2009). Randomization in the Design of Experiments. International Statistical Review 77: 415–429.

Diehl HS, Baker AB, Cowan DW (1938). Cold vaccines: an evaluation based on a controlled study. JAMA 111: 1168–1173.

Doll R (2002). The role of data monitoring committees. In: Duley L, Farrell B (editors) Clinical Trials. London: BMJ Books, p97.

Donaldson I (2016). Van Helmont’s proposal for a randomised comparison of treating fevers with or without bloodletting and purging. Journal of the Royal College of Physicians of Edinburgh 46: 206–213.

Editorial (unsigned) (1937). Mathematics and Medicine. Lancet 229: 31.

Edwards MV (2004). Control and the therapeutic trial 1918-1948. MD Thesis, University of London, UK, pp 68-74.

Farewell V, Johnson T (2017). Major Greenwood and clinical trials. Journal of the Royal Society of Medicine 110: 452–457.

Fibiger J (1898). Om serumbehandling af difteri [On serum treatment of diphtheria]. Hospitalstidende 6: 309–325, 337–350.

Fisher RA (1925). Statistical methods for research workers. Edinburgh: Oliver & Boyd.

Fisher RA (1935). The design of experiments. Edinburgh: Oliver & Boyd.

Franklin J (2001). The science of conjecture: evidence and probability before Pascal. Baltimore: Johns Hopkins University Press, p.283.

Gluud C (2011). Danish contributions to the evaluation of serum therapy for diphtheria in the 1890s. Journal of the Royal Society of Medicine 104: 219–222.

Gluud C, Hilden J (2009). Povl Heiberg’s 1897 methodological study on the statistical method as an aid in therapeutic trials. Preventive Medicine 48: 600–603.

Heiberg P (1897). Studier over den statistiske undersøgelsesmetode som hjælpemiddel ved terapeutiske undersøgelser [Studies on the statistical study design as an aid in therapeutic trials]. Bibliotek for Læger 89: 1–40.

Hill AB (1937). Principles of Medical Statistics (1st edition). London: Lancet.

Hill AB (1955). Principles of Medical Statistics (6th edition). London: Lancet.

Hill AB (1963). Medical ethics and controlled trials. BMJ 1(5337): 1043-1049.

Hróbjartsson A, Gøtzsche PC, Gluud C (1998). The controlled clinical trial turns 100 years: Fibiger’s trial of serum treatment of diphtheria. BMJ 317: 1243–1245.

Kaptchuk TJ (1998). Intentional ignorance: a history of blind assessment and placebo controls in medicine. Bulletin of the History of Medicine 72: 389–433.

Knuth DE (1997). The art of computer programming vol. 2 (3rd ed.) Reading, Mass: Addison-Wesley, pp. 149-189.

Lilienfeld AM (1982). Ceteris paribus: the evolution of the clinical trial. Bulletin of the History of Medicine 56: 1–18.

Mainland D (1948). Statistical methods in medical research. I. Qualitative statistics (enumeration data). Canadian Journal of Research 26: 1-166.

Mainland D (1963). Elementary medical statistics. 2nd ed. Philadelphia: WB Saunders.

Matthews JR (1995). Quantification and the quest for medical certainty. Princeton: Princeton University Press.

Medical Research Council (1948). Streptomycin treatment of pulmonary tuberculosis. BMJ 2: 769-782.

Nardini C (2014). The ethics of clinical trials. E-cancer 8: 387-395.

Neuhauser D, Provost SM, Provost LP (2020). It is time to reconsider factorial designs: how Bradford Hill and RA Fisher shaped the standard of clinical evidence. Quality Management in Health Care 29: 109-122.

Opinel A, Tröhler U, Gluud C, Gachelin G, Smith GD, Podolsky SH, Chalmers I (2011). Commentary: The evolution of methods to assess the effects of treatments, illustrated by the development of treatments for diphtheria, 1825–1918. International Journal of Epidemiology 42: 662–676.

Oppenheim AL (1977). Ancient Mesopotamia: portrait of a dead civilization. Chicago: University of Chicago Press, pp 99-100.

Schulz KF, Grimes DA (2002). Generation of allocation sequences in randomised trials: chance, not choice. Lancet 359: 515–519.

Silverman WA, Chalmers I (2001). Casting and drawing lots: a time-honoured way of dealing with uncertainty and ensuring fairness. BMJ 323: 1467–1468.

Stigler S (2016). The seven pillars of statistical wisdom. Cambridge: Harvard University Press, p 166.

Stolberg M (2006). Inventing the randomized double-blind trial: the Nuremberg salt test of 1835. Journal of the Royal Society of Medicine 99: 642–643.

Tröhler U (2020). The French road to Gavarret’s clinical application of probabilistic thinking Part 2: Louis-Denis-Jules Gavarret. Journal of the Royal Society of Medicine 113: 360-366.

Waller LA (1997). A note on Harold S. Diehl, randomization, and clinical trials. Controlled Clinical Trials 18: 180–183.

Wessely S (2007). A defence of the randomized controlled trial in mental health. BioSocieties 2: 115–127.

Yoshioka A (1998). Use of randomisation in the Medical Research Council’s clinical trial of streptomycin in pulmonary tuberculosis in the 1940s. BMJ 317: 1220-1223.