Farewell V (2026). Mainland’s Elementary Medical Statistics (1952): a pivotal text in statistical pedagogy.

© Vern Farewell, MRC Biostatistics Unit, University of Cambridge. Email: vern.farewell@mrc-bsu.cam.ac.uk


Cite as: Farewell V (2026). Mainland’s Elementary Medical Statistics (1952): a pivotal text in statistical pedagogy. JLL Bulletin: Commentaries on the history of treatment evaluation (https://www.jameslindlibrary.org/articles/mainlands-elementary-medical-statistics-1952-a-pivotal-text-in-statistical-pedagogy/)


Introduction

Austin Bradford Hill, following in the footsteps of Major Greenwood, is regarded as the leading figure in the development of medical statistics in the UK during the middle years of the 20th Century. His influence also extended beyond the UK through his publications and international collaborations. In the US, Raymond Pearl was an early proponent of the importance of medical statistics, and a contemporary of Greenwood. Also, individuals such as Harold Dorn, William Cochran and Donald Mainland held senior posts in medical statistics or public health contemporaneously with Hill.

More specifically, and although the story of the development of clinical trials is not straightforward (Chalmers et al. 2012), Hill is sometimes characterised as “the father of the modern clinical trial” because of his early use of randomisation in trials, notably the well known MRC streptomycin trial (Medical Research Council 1948). Additionally, in epidemiology, the Bradford Hill Criteria for Causation have been widely used for exploring issues of causation in non-randomised epidemiological studies for many years. Both topics feature in a 1952 textbook, Elementary Medical Statistics (Mainland 1952), by Mainland, who at the time of publication was Professor of Medical Statistics in the Department of Preventive Medicine at New York University.

The purpose of this article is to highlight the contributions of Mainland’s text and, more specifically, to suggest why his writing reflects, much more strongly than Hill’s, the statistical thinking of Sir Ronald Fisher, who developed many of the methods for the design and analysis of experiments that dominated the statistical landscape during the period when Mainland and Hill were writing. The specific context of treatment allocation, discussed in this paper, has also been addressed by Matthews (2026a, 2026b, 2026c).

Historical Background to the State of Medical Statistics in 1952

In the early years of the 20th century, the most influential medical statistician in the UK was Major Greenwood (Farewell and Johnson 2014a). Greenwood played a major role in the early days of the UK Medical Research Council (initially Committee), chairing a Medical Committee that worked in tandem with the MRC Statistical Department headed up by the medically qualified John Brownlee (Farewell and Johnson 2014b) until his early death in 1927. After Brownlee’s death, MRC statistical activities were consolidated under Greenwood’s leadership with the Statistical Department moving to the London School of Hygiene and Tropical Medicine (LSHTM) where Greenwood had been appointed Professor.

When Greenwood, who was also medically qualified, moved into research activities, he sought out training and advice from Karl Pearson at University College London. While Pearson was primarily interested in what was then termed biometry, which largely focused on biological applications, Greenwood recognized that Pearson’s methods would be valuable in medical research. Further, John Brownlee was also known as a “disciple” of Pearson although he did not, apparently, have any contact with Pearson beyond his publications (Farewell and Johnson 2014b).

As a result, in the UK, early work in what might be termed modern medical statistics, in contrast to previous work which was largely demographic and related to official statistics, was primarily informed by Pearsonian methods. Concurrently however, R.A. Fisher was developing the foundations of modern statistics, in general, in the context of experimental work at the Rothamsted agricultural research establishment. Generally, and specifically through his involvement in the Royal Statistical Society, Greenwood would have known of Fisher’s work but his background probably made it unlikely that he would engage with the technical details. It can be noted, however, that Greenwood recognised the need for individuals with mathematical skills in statistical work, such as Isserlis and Newbold (Farewell et al. 2006), and used mathematical developments in some of his own research, even in his last paper on the topic of accidents (Greenwood 1950).

Greenwood’s protégé, who would succeed him at the LSHTM and in directing MRC statistical activity until the 1960s, was Austin Bradford Hill. Hill became involved in medical research under Greenwood’s direction after being discharged from World War I service on medical grounds and taking a correspondence degree in Economics at LSE while convalescing. As Armitage records (Armitage 1991a, 1995, 2003), Hill was not interested in the theoretical aspects of statistical research but was primarily concerned with seeing statistics properly used in medical research. To this end, Hill wrote a series of articles for The Lancet that formed the basis of his famous book, Principles of Medical Statistics (Hill 1937), first published in 1937 and with 11 further editions, the last in 1991. Hill and Fisher knew each other (Armitage 2003) but Fisher’s statistical methods are not mentioned in any detail in Hill’s writing. This is the case in spite of there being some suggestion of Fisher’s influence on Hill’s colleagues Woods and Russell (Farewell and Johnson 2012a) and, very directly, on Oscar Irwin (Armitage 1995) who was in Hill’s department and had studied with Fisher. Nevertheless, Fisher was complimentary of Hill’s book (Farewell and Johnson 2012b), although it did not reflect major themes of Fisher’s writings. A more comprehensive assessment of Hill’s attitude to Fisher’s work is given by Matthews (2026b).

Hill was internationally recognised as a leader in medical statistics and visited the US on a number of occasions. However, the development of medical statistics in the US followed a somewhat different course than that in the UK. While not providing here a comprehensive assessment of this development, an apparent difference relating to the influence of Fisher can be noted. Raymond Pearl was an early advocate of statistical methods in medical research in the US and he, like Greenwood of whom he was a contemporary, also studied with Karl Pearson (Greenwood 1941). Pearl’s book, Medical Biometry and Statistics (Pearl 1923), makes very little reference to Fisher, perhaps reflecting the antipathy between Pearson and Fisher at that time. This is also true in the third edition published in 1940 (Pearl 1940), although that edition does credit Fisher with determining the appropriate degrees of freedom for the chi-squared test, without mentioning that Fisher’s arguments corrected those of Pearson.

In the 1950s however, the US situation changed. After undergraduate study at Cambridge, William Cochran (see https://mathshistory.st-andrews.ac.uk/Biographies/Cochran/) had started a PhD there but became convinced that a post with Frank Yates at Rothamsted would serve him better, even though his Cambridge PhD work had already led to his first paper (Cochran 1934), outlining what is now known as Cochran’s Theorem, the theoretical basis for the F-tests associated with Analysis of Variance. However, in 1939, Cochran left the UK for a post at the Iowa Statistical Laboratory, where the American statistician George Snedecor was Director. Snedecor recognized the value of Fisher’s work early in his career and Fisher visited Iowa State in the summers of 1931 and 1936 (Cox 2005). Cochran’s extensive exposure to Fisherian statistics was brought into medical statistics when he accepted, in 1948, the Chair of the Johns Hopkins Department of Biostatistics, a post he held for 10 years. Notably, while there, he was involved in the trial of the polio vaccine, introducing some randomisation and arguing (unsuccessfully) to allow for over-dispersion due to clustering (Gehan and Lemak 1994). Around the same time, in 1950, Mainland, who had spent summers with Fisher in the 1930s, moved from Canada to the US to take up a post as Professor of Medical Statistics in New York University. As will be seen, Mainland’s book is imbued with Fisher’s statistical arguments. A third leading figure, Harold Dorn (Lilienfeld 2008), joined the US National Institutes of Health in 1948 and until 1961 was “considered, de facto if not de jure, chief statistician of the NIH” (Cornfield 1963). Dorn, although his academic genealogy connected to Pearson at University College London, was, by training, a sociologist interested in surveys. However, he worked with Fisher, and Egon Pearson, when he spent the academic year 1933-34 at University College London. Fisher, who had no specific interest in medical statistics himself (outside of genetics), was suddenly, therefore, very much an influence on medical statistics in the US.

In general, it can be conjectured that at this time, there was also a shift in the approach to medical statistics as mathematically well-trained statisticians, who would be familiar with statistical theory, began to move into the field. Notable examples were Jerry Cornfield in the US who joined Dorn at the NIH and Peter Armitage who joined Hill at the LSHTM. However, it would take a few years before the influence of such appointments would be felt, and this would certainly have been after the publication of Mainland’s 1952 book. In assessing the role of Mainland’s book therefore, it is the comparative experience and interests of the senior medical statisticians at that time that is most relevant.

Donald Mainland and Elementary Medical Statistics

Doug Altman (2020) has provided an excellent biographical article on Mainland and a basic summary of his career is given in the opening paragraph of Altman’s “Brief Biography” section:

“Donald Mainland graduated in medicine at Edinburgh. He taught anatomy in Edinburgh and received a Doctor of Science degree there for his research in embryology and histology. In 1927, he moved to Winnipeg, Manitoba, Canada, and in 1930, at the age of 28, became Professor and Chairman of the Department of Anatomy at Dalhousie University. Even his earliest publications showed an interest in measurement issues, and foreshadowed an increasing interest in statistics. In 1938, he published his first book on statistics in medicine. In 1950, he became Professor of Medical Statistics at the New York University and shortly afterwards published his best known book, Elementary Medical Statistics. Thereafter, Mainland was a prolific and influential writer on statistical topics.”

Mainland’s initial research work was largely laboratory-based, for example on the embryology of ferrets. However, a colleague in Winnipeg introduced him, in 1928, to Fisher’s book, Statistical Methods for Research Workers, first published in 1925 (Fisher 1925). Mainland quickly recognised the relevance and importance of statistical ideas to his own work. His interest grew and, after presumed correspondence, Fisher invited him in April 1934 to visit him that summer in London. There were additional summer visits in the late 1930s. It seems likely, therefore, that Mainland would have discussed with Fisher not only the content of Statistical Methods for Research Workers but also Fisher’s Design of Experiments (Fisher 1935).

Mainland’s first book on statistics in medicine, published in 1938 (Mainland 1938), focused primarily on laboratory data and, as Altman notes, “Fisher was thanked profusely in the preface”. More general coverage was provided in a 1948, 166-page, journal article, “Statistical methods in medical research. I qualitative statistics (enumeration data)” (Mainland 1948). This was towards the end of his time at Dalhousie in Halifax where he taught courses on statistics. These courses provided the basic material for his 1952 book, although it was published after he had moved to New York University. Parenthetically, Part II (Mainland and Sutcliffe 1953), dealing with sample sizes, appeared five years later and was only 11 pages in length.

Why did Mainland write Elementary Medical Statistics? One answer to this question is found in the first paragraph of the book’s section entitled “The Purpose of This Book”. It reads:

“Because the neglect of statistical methods in medicine is due largely to faulty training of students, some medical schools are now trying to correct the fault, and this book is an outgrowth of such attempts. It is designed primarily for students who are to become practitioners; but the principles and techniques are the same as those needed by investigators at the beginning of their careers, for they are common to all branches of medicine, from histology to psychiatry.”

But why did Mainland write a book with this purpose? There are perhaps three reasons. The first is that Mainland had for many years been a medical researcher but had early on been convinced of the importance of statistical thinking. As Chairman of the Department of Anatomy in the Dalhousie Medical School, he published a major 1938 textbook on anatomy but he also developed courses in statistics, first for students of anatomy and then for the medical school more generally. As Altman documents, however, Mainland’s move to a position as Professor of Medical Statistics was made so that he could concentrate on statistics, and have additional scope to become involved in clinical trials. Although he was a self-taught statistician, statistics was clearly now his primary interest.

The second reason was that his road to an interest in statistics made him wary of the obvious solution to the “neglect of statistical methods in medicine”, which was to make statistics part of the medical curriculum. As he writes in the first sentence of his book’s preface, “Those who have for many years stressed the importance of statistical thinking in medicine cannot be entirely happy to see statistics become established as a subject in the undergraduate curriculum …”. This sounds counter-intuitive or self-contradictory but his worry is that this will shift the focus to “board examinations which foster static pedagogy” and undermine the potential for statistics in medicine to break down “interdepartmental barriers” and to establish “a set of principles by which we can draw valid conclusions from experience”.

The third and pragmatic reason is that he had developed course notes and this would provide a broader outlet for their use while propagating his views on medical statistics.

His views are reflected in the style of the book. Although he gives, for example, the details of how to perform chi-squared tests on categorical (described by Mainland as ‘enumeration’) data and t-tests on continuous (‘measurement’) data, he intersperses this with broad-ranging discussions of when they should be used and how they should be interpreted. This intermingling is deliberate and clearly important to him: he signposts in the introductions of various chapters that a “student” might better study the pages dealing with the details of the methods before returning to the material on broader issues regarding their application.

Some of these broader issues will be highlighted in the appendix for this article, which examines the chapters of Mainland’s book individually. However, the issue of randomisation, and causal inference more generally, is of particular interest at the present time to both statisticians and medical researchers, and the coverage of this topic in Elementary Medical Statistics is examined separately in the next section.

Randomisation and Causal Inference

Randomisation per se is first mentioned in Chapter 2, titled “On Looking at Evidence”, under the heading “Lack of Objectivity” (p.27). Mainland uses the 1948 publication reporting the results of the MRC streptomycin trial in pulmonary tuberculosis to illustrate how objectivity must be planned into an investigation. His first observation is that, prior to the trial, there were indications that streptomycin might be more beneficial than alternative treatments and that, therefore, physicians might want to preferentially give this treatment to certain patients. Mainland, at this point, says simply “This risk was avoided by the sampling method that will be described in Chapter 4”. That method is randomisation, a key feature of which is that treatment assignment is independent of the characteristics of the patient. His second comment on objectivity relates to blinded reading of x-ray films, which was also a feature of the MRC trial.

It is in the section “On Planning a Simple Experiment” in Chapter 4 that the term “Random Sampling” appears as a sub-heading. In fact, ‘random’ is a very broad word but, in the context of experimentation, it can be regarded as arising when experimental subjects, or items, are assigned a treatment by an “objective impersonal procedure” (Cox 1958). In this section on random sampling, after discussing the value of systematically ensuring a balance between two treatment groups with respect to some known risk factors in reducing variability, Mainland writes (p.103) that this should be done “not as far as possible, but as far as it is convenient and useful” [his emphasis]. To deal with the inevitable differences in other factors that might influence outcome, which cannot be distributed equally to the two treatments, he writes “we must allocate them in such a way that we can tell what allowance to make for the inequalities”. Further, “The only way to do this is to make chance decide for us”.

In these two excerpts, Mainland is addressing, respectively, the two aspects of randomisation that are outlined by David Cox in his book Planning of Experiments (Cox 1958). These are:

  1. “that in a large experiment, it is very unlikely that the estimated treatment effect will be appreciably in error”
  2. “that the random error of the estimated treatment effects can be measured and their level of statistical significance examined, taking into account all possible forms of uncontrolled variation subject to (1)”

The first aspect relates to allowing the definition of an unbiased estimator of a defined treatment difference, the estimand of interest. Generally, a key feature of this is the avoidance of confounding. The second relates to the properties of the estimation process.

Alternate Allocation

In 1952, in clinical trials, the first purpose of randomisation was increasingly recognised but there was frequent reference to a supposed method of randomisation based on alternate allocation of patients to the two treatments under study. Matthews has provided a careful investigation of the history of this method (Matthews 2026a, 2026b, 2026c) and highlights Mainland’s criticisms of this methodology. In the section in Chapter 4 on random sampling in Elementary Medical Statistics, Mainland simply highlights that alternate allocation “is not strictly equivalent to random sampling” and gives some possible examples of systematic differences that might be introduced. He returns to this topic in a later chapter (p.268): here he gives a further example of how an unknown “rhythm”, due to differential risks and patient numbers on days of the week, could bias results. He concludes by saying “When an investigator employs a method that is not strictly random as if it were equivalent to a random technique, the onus is on him to prove it justifiable by experiment, not argument; and this, even if possible, would entail a very large investigation”.
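
Mainland’s “rhythm” objection can be made concrete with a small simulation. The sketch below is illustrative only: the two-day risk cycle, the success probabilities and the one-admission-per-day simplification are invented for the example, not taken from Mainland. Under a null treatment effect, strict alternation that happens to synchronise with the rhythm manufactures a spurious treatment difference, while random allocation does not.

```python
import random

random.seed(1)

DAYS = 10000  # one patient admitted per day (an illustrative simplification)

def outcome(day, treated):
    """Null world: the treatment has NO effect (the 'treated' flag is
    deliberately ignored); risk follows a hidden two-day 'rhythm':
    success probability 0.7 on even days, 0.3 on odd days."""
    p = 0.7 if day % 2 == 0 else 0.3
    return 1 if random.random() < p else 0

def run(allocator):
    """Allocate each day's patient by the given rule and return the
    observed difference in success proportions between arms A and B."""
    results = {"A": [], "B": []}
    for day in range(DAYS):
        arm = allocator(day)
        results[arm].append(outcome(day, arm == "A"))
    mean = lambda xs: sum(xs) / len(xs)
    return mean(results["A"]) - mean(results["B"])

alternate = lambda day: "A" if day % 2 == 0 else "B"   # strict alternation
randomise = lambda day: random.choice(["A", "B"])      # simple randomisation

print(f"alternate allocation difference: {run(alternate):+.3f}")  # near +0.4
print(f"random allocation difference:    {run(randomise):+.3f}")  # near 0
```

Because alternation locks treatment A onto the favourable days, the spurious difference approaches the full 0.4 gap in underlying risk, exactly the kind of hidden bias whose absence, Mainland insists, would have to be proved “by experiment, not argument”.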

These arguments reflect Mainland’s belief, outlined in the section “How Randomization Acts” in Chapter 4 (p.104), that “Random sampling is the only way to equalize the risk of hidden bias” [his emphasis].

As Matthews (2026c) highlights, Mainland’s strong aversion to alternate allocation, reflecting Fisher’s influence, was not shared by other medical statisticians, notably Hill, at least during this time period. Armitage writes, about Hill’s later (1990) reflection on the MRC streptomycin trial, that Hill felt “that alternations of successive cases might have been successful if strictly adhered to, but he (Hill) wrote ‘it’s a very big IF’”. However, Matthews very appropriately highlights that Hill’s writings largely do not reflect this caution and that Hill often presented alternation as a credible method of treatment allocation.

Mainland’s distaste for an approach that carries inferential risks is more generally expressed in his otherwise favourable review (Mainland 1961) of the book Controlled Clinical Trials (Hill 1960), a collection of papers, edited by Hill, given at a 1959 conference on clinical trials in Vienna. While recommending the book, Mainland did introduce one caution. He suggested, referring to a comment by Hill, that the reader should not be “beguiled by a statistician’s understatement (p.17) that ‘however carefully a trial has been planned occasionally things will go wrong.’ ” He then highlights a clinician’s contribution (p. 166) that “a clinical trial is a serious matter and is not to be undertaken lightly … It is important to get the answer right, and this means that an enormous amount of trouble has to be taken in the planning, execution and publication of a trial”. This is clearly Mainland’s view.

Error Estimation

Mainland’s insistence on randomisation, or experimental justification of any alternative, could be said to reflect the high priority Fisher gives to randomisation in his writing. However, in the context of the agricultural experiments in which Fisher primarily worked, Fisher seems simply to assume the value of randomisation in preventing biased treatment estimates and focuses in much more detail on the issue of error estimation. For example, in his book Design of Experiments (Fisher 1935), on page 72, Fisher’s section on “Bias in Systematic Arrangements” illustrates how systematic arrangements thought to bring balance to treatment assignment can lead to incorrect estimates of error for the treatment comparison. The error variance is larger than it should be when the systematic arrangement serves to eliminate the variation due to a particular risk factor but the analysis then assumes that the treatment assignment is random over this factor. If the systematic arrangement happens, for whatever reason, to increase imbalance with respect to the risk factor, then the error variance can be artificially reduced. Thus, the control of confounding and assumptions about the error structure themselves become confounded. This discussion reflects Fisher’s earlier arguments, on pages 20-24, that randomisation provides a physical basis for the validity of significance tests, concluding that “the simple precaution of randomisation will suffice to guarantee the validity of the test of significance, by which the result of the experiment is to be judged”.

Mainland certainly affirms this value of randomisation. Immediately following his section “How Randomization Acts”, Mainland addresses “Interpretation after Random Sampling”. The point he makes here is that the additional benefit of randomisation is that “…we can, after the experiment, use our knowledge of chance to interpret the results”. This clearly relates to the second aspect of randomisation. However, Mainland does perhaps pay more attention to the possibility of bias from unknown factors. This is addressed in the section “How Randomization Acts”, as discussed earlier, and he returns to it in the section “Interpretation after Random Sampling”, where he considers the possibility of “something else” influencing a significance test result. He writes, “If treatments V and W were not allocated at random, the something else might be …some bias due to unknown factors”. In addition, Mainland’s illustrations of the problems with non-randomised studies, discussed in the previous section, relate to systematic arrangements leading to bias in the estimation of the treatment effect, not, or not solely, in the estimation of the error variance of that effect.

But, and it is an important but, Mainland’s recognition of the use of randomisation to justify tests of significance contrasts sharply with the writings of Hill, who, as far as can be judged, pays little or no attention to this matter. As Armitage (1991b) says, and Matthews (2026b), following Silverman and Chalmers (2001), highlights, Hill had little interest in statistical theory in general. Indeed, Matthews goes further to argue that (a) Hill’s attitude was not just indifference to “technicalities” but was actively dismissive and (b) this led to his failure to see the importance of randomisation in drawing inferentially reliable conclusions from clinical trials.

Alternate Allocation and Error Estimation

In a general sense, Armitage (2003) argues that Hill’s views of randomisation were influenced by the context in which he advocated it. Basic principles of comparative experimentation were widely accepted in the agricultural setting within which Fisher worked but fundamental concepts such as the need for simultaneous control were still having to be emphasised in the medical context. Alternate allocation might have been seen as the simplest thing to advocate for this control to those for whom randomisation might have seemed difficult to implement or somehow incompatible with medical care (see Matthews 2026a, 2026b, 2026c).

Mainland’s insistence on randomisation in clinical trials may have been, as Matthews argues, primarily motivated by his understanding of the broader arguments for randomisation, and, perhaps, because his original medical research was in the context of laboratory studies in anatomy, within which the introduction of randomisation would have been less problematic than in trials. Combined with his personal exposure to Fisher, his prescient advocacy of “proper” randomisation in trials is as understandable as it was valuable.

With respect to the risks of alternate allocation, the primary one is surely the introduction of accidental bias as discussed earlier, or perhaps, with the same result, selection bias if assignment is not concealed. The potential for errors in variance estimation is, however, also important. The assumption of no bias must be combined with the assumption of an appropriate sampling frame for statistical inference if alternate allocation is to be appropriate.

Randomisation supplies, to use the terminology of Yates (Yates 1939), an ‘objective’ sampling frame and the validity of significance tests and confidence intervals depends either on this or on an assumption that the data can be regarded as coming from a suitable sampling frame. As indicated in a personal communication to Iain Chalmers (Chalmers 2001), Armitage recognised that alternation can go wrong in this respect if successive responses are not statistically independent but he doubted “whether the effect would be important”. In spite of this, one suspects that Armitage and Mainland would have agreed that the safest approach is true randomisation.

The importance of an objective sampling frame is that it justifies the probability calculations used in significance tests. Independence of observations is critical and this is generally combined with distributional assumptions. Probability calculations may be numerically “exact”, as for Fisher’s test for 2 × 2 tables based on binomial distribution assumptions (although these can also be derived from first principles), or for t-tests or F-tests based on initial assumptions that the data are drawn from normal distributions. Mainland mentions such calculations throughout his book while also pointing out the situations when approximations are practically useful and accurate, for example, for chi-squared tests for contingency tables.
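
As a minimal illustration of such an “exact” calculation, the sketch below computes a one-sided p-value for a 2 × 2 table from first principles, summing the hypergeometric probabilities implied by fixed margins; the trial figures are invented for the example and are not taken from Mainland or Fisher.

```python
from math import comb

def fisher_exact_p(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]:
    the probability, under the null of no association with all margins
    fixed, of a table at least as extreme (cell 'a' at least as large)."""
    row1, col1, n = a + b, a + c, a + b + c + d
    p = 0.0
    for x in range(a, min(row1, col1) + 1):  # tables with first cell >= a
        p += comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)
    return p

# hypothetical small trial: 8/10 successes on treatment, 2/10 on control
print(round(fisher_exact_p(8, 2, 2, 8), 4))  # prints 0.0115
```

The calculation needs no appeal to a normal approximation, which is why such tests were attractive for the small samples common in early clinical studies; the chi-squared approximation mentioned by Mainland becomes adequate as cell counts grow.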

The additional advantage given through randomisation, however, is that significance testing can be performed based only on the randomisation distribution, without the need for distributional assumptions. Fisher, it seems, regarded this as less important practically, as evidenced in his book The Design of Experiments, where he writes (pages 50-51 of the 1st edition (Fisher 1935)), referring to a test of the difference between two normal means:

“There has, however, in recent years, been a tendency for theoretical statisticians, not closely in touch with the requirements of experimental data, to stress the element of normality in the hypothesis tested, as if it were a serious limitation to the test provided. It is, indeed, demonstrable that, as a test of this hypothesis, the exactitude of “Student’s” t test is absolute. It may, nevertheless, be legitimately asked whether we should obtain a materially different result were it possible to test the wider hypothesis which merely asserts that the two series are drawn from the same population, without specifying that this is normally distributed.”

Frank Yates, Fisher’s close collaborator, writes consistently with this view, in a slightly different context, that questioning the validity of a random experiment “because the original material is not normally distributed” must “be regarded rather as a debating point than a serious objection” on page 441 of a 1939 paper (Yates 1939).

Rosenberger, Uschner, and Wang (2019), noting current computing capabilities, nevertheless make a strong case for the use of randomisation tests, even, or especially, in more complex trials. Also, the use of a randomisation test might be seen as a simple example of “assumption lean inference” (Vansteelandt and Dukes 2022), which more generally argues for robustness of inference to allow for misspecified models. In addition, minimally, as is also highlighted by Rosenberger, the examination of randomisation tests can be very helpful in understanding the nature of an experiment and the essential aspects of any analysis that is adopted, whether or not a formal randomisation test is used, or even practical. An earlier illustration of this is given by Nelder (1964). This is also consistent with the remark of Cox (1982) concerning clinical trials: “While the final analysis may not be based explicitly on the randomization distribution, it is necessary that there should be some broad correspondence with randomization theory”.

However, all these authors would also agree that some medical studies are, and may have to be, observational and do not involve randomisation of comparison groups. In these studies, the two problems of bias and error estimation cannot be removed by randomisation so other arguments must be made. These are explored in the next section.

Non-randomised Studies and Causal Inference

On page 5 of Elementary Medical Statistics, Mainland says that there is “no absolute division between observation with experiment and observation without experiment”. Later in the same chapter on page 37, he deals specifically with the issue of “Causal Interpretation”. While he describes this section as a “glance” at the topic, it is noteworthy that the flavour of the discussion is totally consistent with the, now famous, criteria for causation in the presence of a demonstrable association that were given by Austin Bradford Hill in a 1965 paper (Hill 1965). This 1965 paper builds on or is closely linked with prior work by Yerushalmy and Palmer (1959), the 1964 US Surgeon General’s report on Smoking and Health (United States Public Health Service 1964) and an earlier 1962 paper by Hill (1962).

Mainland’s discussion first acknowledges the issues raised due to the multiplicity of causes that may play a role in a disease process. He then specifically, referencing Greenwood (1944) who wrote that any causal interpretation of an association must be credible biologically, acknowledges that this depends on the state of knowledge concerning the disease process, and, specifically, that the time relationship between putative cause and effect must be sensible. Summarising later, he describes this as showing “why” a demonstrable association occurred.

It is certainly noteworthy that Mainland presents such a discussion well before the topic gained the prominence it did in the later part of the 1950s, through discussions of Doll and Hill’s work on the link between smoking and cancer. Parenthetically, it can also be noted that, in this discussion, Mainland alludes to the plausibility of a link between cigarettes and mortality from heart disease, a link which, although associated with a lower relative risk, leads to more deaths than the link with lung cancer.

With respect to the statistical methods to be used in observational studies, Mainland writes, following his observation of no absolute division between observational and experimental data: “One of the most important techniques in Fisher’s Statistical Methods was illustrated in the study of rainfall records; and the methods of sampling and analysis of observational data in public health are now being changed in accordance with the new methods”. It is therefore clear that Mainland accepted the validity of statistical methods, i.e. that they have known error properties, in non-randomised studies, and that this is separate from the issue of drawing causal inferences from such studies. It is not the validity of a significance test that is questionable, but its meaning, i.e. the broader inference that can be drawn from it.

With respect to the sampling frame that underlies significance testing in observational studies, Fisher is sometimes characterised as a frequentist, a term that relates to the notion of the long-run behaviour of statistical procedures. Cox (2016), however, writes:

“Fisher is often thought of as frequentist in his thinking, but this is rather misleading. He strongly emphasized that when probability was used to describe what underlay a set of data, he did not have in mind probability as a limiting frequency over a large number of repetitions. Rather, by probability Fisher meant a proportion in a hypothetical infinite population, the data being regarded as a random sample from that hypothetical population. This in particular allowed the associated methods to be applied to situations, such as studies of literary authorship, in which direct replication of the data was inconceivable.”

Discussion

A remarkable career

In his biography of Mainland, Altman (2020) writes: “Donald Mainland’s career was remarkable, with a unique move from anatomy to medical statistics and clinical trials”. Mainland is not unique among early medical statisticians in moving from medicine to medical statistics: Major Greenwood (Farewell and Johnson 2016) and Raymond Pearl (Greenwood 1941) are other, earlier, examples. Indeed, while the pathway from medicine to medical statistics is less evident currently, it is not unknown and probably still brings particular advantages. There is, however, a uniqueness in Mainland’s move from the field of anatomy, where experimentation was a central feature of research. This background may have contributed in different ways to the continued relevance of Mainland’s book, and it was surely a factor in his recognition that Fisher’s work on experimental design and statistical inference should inform medical research.

The remarkable nature of Mainland’s career is not, however, solely linked to his appreciation of Fisher. As evidenced in his treatment of randomisation and causal inference, Mainland was concerned with many aspects of the treatment of medical data, and this is even more evident when the full contents of his 1952 book are examined, as is done in the Appendix to this article. For example, as Neuhauser, Provost, and Provost (2020) and Matthews (2026c) point out, factorial designs, which Fisher promoted, were identified as important by Mainland but little mentioned by other medical statisticians at that time. Furthermore, Mainland covers two topics that were little discussed then but are currently major topics of interest in medical statistics: intercurrent events, discussed in Chapter 4, and mixture models, in Chapter 5. The brief remarks on these in the Appendix are also included here.

Intercurrent Events

There is a brief subsection in the section on experimental planning in Chapter 4 on what Mainland calls intercurrent events. This would appear to be the first use of this term, now more commonly used, often in the context of discussions of intention-to-treat (Chalmers et al. 2023a, 2023b), to refer to the possibility of events taking place after treatment which may influence the patient’s outcome. The examples given of such events are treatment supplementation, change of treatment, accidents or diseases which may or may not be associated with the condition under study, the suspension of treatment for the patient’s business or domestic affairs, and loss of follow-up of a patient for a variety of possible causes including death. Mainland summarises what should be done as follows: “In deciding what should be done with data from any such patients the criterion must always be whether their inclusion or omission would introduce bias. Unless the appropriate decision is obvious, the best plan is to analyze all the data together, then to analyze the special cases and the main series separately”. The inclusion of such a section in 1952 seems particularly remarkable.

Mixture models

A short subsection of interest in Chapter 5, titled “Heterogeneous Samples”, makes the general point about being aware of known heterogeneity but focusses on the example of dental caries, where the frequency of patients with zero caries may be high or low independently of the frequencies of the non-zero categories of caries. This can be seen as an early example of the possible value of mixture models for zero-heavy count data and other similar data (Farewell et al. 2017).
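As an illustration of the modern formulation of this idea, in notation that is not Mainland’s, a zero-inflated count model posits a mixture of a degenerate distribution at zero and a standard count distribution, for example a Poisson:

```latex
% Zero-inflated Poisson mixture: with probability \pi an individual is a
% "structural" zero (e.g. caries-free regardless of exposure); otherwise
% the count follows a Poisson(\lambda) distribution.
\Pr(Y = 0) = \pi + (1 - \pi)\,e^{-\lambda}, \qquad
\Pr(Y = k) = (1 - \pi)\,\frac{e^{-\lambda}\lambda^{k}}{k!}, \quad k = 1, 2, \dots
```

The mixing proportion \(\pi\) allows the frequency of zeros to vary independently of the distribution of the non-zero counts, which is precisely the feature of the dental caries example that Mainland highlighted.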

Final remarks

As indicated previously, the writing of Elementary Medical Statistics was motivated by Mainland’s interest in the teaching of statistics to students of medicine. However, as discussed in Section 3, Mainland had reservations about its formal introduction into medical training. The debate on how medical statistics is best introduced continues in medical schools today although, due to Mainland and others, its importance is unquestioned. Reflection on Mainland’s writings can certainly contribute to this debate.

Finally, I should like to note the influence of Mainland on the late Tony Johnson (Farewell 2023), with whom I had the privilege of writing numerous articles on the history of medical statistics, some of which are referenced in this paper. Tony entered the field of medical statistics from mathematics. A key book for him, when he needed a rapid introduction to the field, was Mainland’s Elementary Medical Statistics, likely the second edition. It was therefore part of the foundation of Tony’s long career in medical statistics and, I am sure, the work of many others has also been influenced by Mainland’s passion for making statistical thinking a key component of medical research.

Acknowledgements

I thank Iain Chalmers for his suggestion that Mainland’s 1952 book warranted a detailed examination and for his encouragement throughout the writing of this article. I also thank Robert Matthews, Daniel Farewell and Agnes Herzberg for helpful comments and discussions, and John Matthews for a careful and thoughtful review of the manuscript.

References

Altman D (2020). Donald Mainland: Anatomist, Educator, Thinker, Medical Statistician, Trialist, Rheumatologist. Journal of the Royal Society of Medicine 113: 28–38.

Armitage P (1991a). Bradford Hill and the Randomized Controlled Trial. Pharmaceutical Medicine 6: 23–37.

Armitage P (1991b). Obituary: Sir Austin Bradford Hill, 1897-1991. Journal of the Royal Statistical Society Series A 154: 482–84.

Armitage P (1995). Before and After Bradford Hill: Some Trends in Medical Statistics. Journal of the Royal Statistical Society, Series A 158: 143–53.

Armitage P (2003). Fisher, Bradford Hill, and randomization. International Journal of Epidemiology 32: 925–28.

Chalmers I (2001). Comparing Like with Like: Some Historical Milestones in the Evolution of Methods to Create Unbiased Comparison Groups in Therapeutic Experiments. International Journal of Epidemiology 30: 1156–64.

Chalmers I, Dukan E, Podolsky S, Davey Smith G (2012). The Advent of Fair Treatment Allocation Schedules in Clinical Trials During the 19th and 20th Centuries. Journal of the Royal Society of Medicine 105: 221–27.

Chalmers I, Matthews R, Glasziou P, Boutron I, Armitage P (2023a). Trial Analysis by Treatment Allocated or by Treatment Received? Origins of the ‘Intention-to-Treat Principle’ to Reduce Allocation Bias: Part 1. Journal of the Royal Society of Medicine 116: 343–50.

Chalmers I, Matthews R, Glasziou P, Boutron I, Armitage P (2023b). Trial Analysis by Treatment Allocated or by Treatment Received? Origins of the ‘Intention-to-Treat Principle’ to Reduce Allocation Bias: Part 2. Journal of the Royal Society of Medicine 116: 386–94.

Cochran WG (1934). The Distribution of Quadratic Forms in a Normal System, with Applications to the Analysis of Covariance. Mathematical Proceedings of the Cambridge Philosophical Society 30: 178–91.

Cornfield J (1963). Harold Fred Dorn. American Statistician 17: 53.

Cox DF (2005). Snedecor, George Waddel. In Encyclopedia of Biostatistics, 5009–11. Chichester, UK: Wiley.

Cox DR (1958). Planning of Experiments. New York: Wiley.

Cox DR (1982). A Remark on Randomization in Clinical Trials. Utilitas Mathematica 21A: 245–52.

Cox DR (2016). Some Pioneers of Modern Statistical Theory: A Personal Reflection. Biometrika 103: 747–59.

Farewell V (2023). Obituary: Anthony Leonard Johnson (1943-2022). Statistics in Medicine 42: 597–99.

Farewell V, Johnson A (2012a).  The first British textbook of medical statistics. Journal of the Royal Society of Medicine 105: 446–48.

Farewell V, Johnson A (2012b). The Origins of Austin Bradford Hill’s Classic Textbook of Medical Statistics. Journal of the Royal Society of Medicine 105: 483–89.

Farewell V, Johnson T (2014a). Major Greenwood’s early career and the first departments of medical statistics. Statistics in Medicine 33: 2161–73.

Farewell VT, Johnson TL (2014b). Commentary: Dr John Brownlee MA, MD, DSc, DPH (Cantab), FRFPS, FSS, FRMetS (1868-1927), public health officer, geneticist, epidemiologist and medical statistician. International Journal of Epidemiology 42: 935–43.

Farewell V, Johnson T (2016). Major Greenwood (1880-1949): The Biography. Statistics in Medicine 35: 5533–35.

Farewell V, Johnson T, Armitage P (2006). ‘A Memorandum on the Present Position and Prospects of Medical Statistics and Epidemiology’ by Major Greenwood. Statistics in Medicine 25: 2161–77.

Farewell VT, Long DL, Tom BDM, Yiu S, Su L (2017). Two-Part and Related Regression Models for Longitudinal Data. Annual Review of Statistics and Its Application: 669–90.

Fisher RA (1925). Statistical Methods for Research Workers. London, UK: Oliver and Boyd.

Fisher RA (1935). The Design of Experiments. London, UK: Oliver and Boyd.

Gehan EA, Lemak NA (1994). Statistics in Medical Research: Developments in Clinical Trials. New York, USA: Plenum.

Greenwood M (1941). Obituary: Raymond Pearl. Journal of the Royal Statistical Society 104: 94–96.

Greenwood M (1944). Mr. Shaw on doctors. British Medical Journal 2: 570–71.

Greenwood M (1950). Accident Proneness. Biometrika 37: 24–29.

Hill AB (1937). Principles of Medical Statistics. London, UK: Lancet.

Hill AB, editor (1960). Controlled Clinical Trials. Springfield, Illinois: Charles C Thomas.

Hill AB (1962). The Statistician in Medicine. Journal of the Institute of Actuaries 88(2): 178–91.

Hill AB (1965). The Environment and Disease: Association or Causation? Proceedings of the Royal Society of Medicine 58(5): 295–300.

Lilienfeld DE (2008). Harold Fred Dorn and the First National Cancer Survey (1937-1939): The Founding of Modern Cancer Epidemiology. American Journal of Public Health 98(12): 2150–58.

Mainland D (1938). The Treatment of Clinical and Laboratory Data: An Introduction to Statistical Ideas and Methods for Medical and Dental Workers. Edinburgh, UK: Oliver and Boyd.

Mainland D (1948). Statistical Methods in Medical Research i. Qualitative Statistics (Enumeration Data). Canadian Journal of Research 26: 1–166.

Mainland D (1952). Elementary Medical Statistics, 1st Edition. Philadelphia, USA: WB Saunders Company.

Mainland D (1961). Review: Controlled Clinical Trials, Ed: A. B. Hill. Journal of the American Statistical Association 56: 758–59.

Mainland D, Sutcliffe MI (1953). Statistical Methods in Medical Research. II. Sample Sizes Required in Experiments Involving all-or-none Responses. Canadian Journal of Medical Sciences 53: 406–16.

Matthews RAJ (2026a). The problematic history of randomised controlled trials, Part 1: presumption and confusion on the road to randomisation. Journal of the Royal Society of Medicine 119(1): 25-30.

Matthews RAJ (2026b). The problematic history of randomised controlled trials, Part 2: Hill’s “pragmatic” view of randomisation and its origins. Journal of the Royal Society of Medicine doi:10.1177/01410768261419045.

Matthews RAJ (2026c). The problematic history of randomised controlled trials, Part 3: Mainland, Hill and the future of RCTs. Journal of the Royal Society of Medicine doi:10.1177/01410768261419062.

Medical Research Council (1948). Streptomycin Treatment of Pulmonary Tuberculosis. British Medical Journal 2: 769–82.

Morgan KL, Rubin DB (2012). Rerandomization to Improve Covariate Balance in Experiments. The Annals of Statistics 40: 1263–82.

Nelder JA (1964). The Analysis of Randomized Experiments with Orthogonal Block Structure. I. Block Structure and the Null Analysis of Variance. Proceedings of the Royal Society of London, A 283: 147–62.

Neuhauser D, Provost SM, Provost LP (2020). It Is Time to Reconsider Factorial Designs: How Bradford Hill and RA Fisher Shaped the Standard of Clinical Evidence. Quality Management in Health Care 29(2): 109–22.

Pearl R (1923). Medical Biometry and Statistics, 1st Edition. Philadelphia, USA: WB Saunders Company.

Pearl R (1940). Medical Biometry and Statistics, 3rd Edition. Philadelphia, USA: WB Saunders Company.

Rosenberger WF, Uschner D, Wang Y (2019). Randomization: The Forgotten Component of the Randomized Clinical Trial. Statistics in Medicine 38(1): 1–12.

Silverman WA, Chalmers I (2001). Casting and drawing lots: a time-honoured way of dealing with uncertainty and ensuring fairness. British Medical Journal 323(7237): 1467–68.

Snedecor GW (1946). Statistical Methods Applied to Experiments in Agriculture and Biology. Ames, Iowa, USA: Iowa State College Press.

United States Public Health Service (1964). Smoking and Health: Report of the Advisory Committee to the Surgeon General of the Public Health Service. Washington, DC: Department of Health, Education, and Welfare.

Vansteelandt S, Dukes O (2022). Assumption-Lean Inference for Generalised Linear Model Parameters (with Discussion). Journal of the Royal Statistical Society, Series B 84: 657–739.

Yates F (1939). The Comparative Advantages of Systematic and Randomized Arrangements in the Design of Agricultural and Biological Experiments. Biometrika 30: 440–66.

Yates F (1948). Contribution to discussion of “The validity of comparative experiments” by FJ Anscombe. Journal of the Royal Statistical Society, Series A 111: 204–5.

Yerushalmy J, Palmer CE (1959). On the Methodology of Investigations of Etiologic Factors in Chronic Diseases. Journal of Chronic Disease 10(1): 27–40.