Numbers are needed to record the results of fair tests of treatments, and tables and graphs are used to describe the characteristics and experience of groups of patients, the treatment they have received, and quantitative estimates of treatment effects.
Using quantification in testing treatments
It was not until the early 18th century that numbers began to be used to assess the effects of medical treatments (Nettleton 1722a; 1722b; 1722c; Jurin 1724; Huth 2005; Boylston 2010; Boylston 2012). This development occurred chiefly, but not exclusively, in Britain (Tröhler 2010). In 1732, Francis Clifton published a book entitled ‘The state of physick, ancient and modern, briefly considered: with a plan for the improvement of it’ (Clifton 1732). He pointed out that, instead of assessing the worth of therapies by whether they accorded with theories, physicians needed to base their judgements about the effects of treatments on a sufficient number of their own (or otherwise testified) observations, organised in tables. A series of authors emphasised similar principles throughout the 18th century. Quantification began in the 1720s with comparisons of death rates following variolation (inoculation) against smallpox with death rates associated with the disease itself, establishing the relative safety of the former. In this field, numbers were used throughout the century in various European countries, and they played a role in the introduction of vaccination around 1800. By then numerical data had become important criteria for assessing new therapies in surgery and medicine (Tröhler 2010).
Using tables and graphs to present treatment comparisons
In the British Army, John Rollo (1781) may have been the first to use tables to give a detailed account of all the cases he had treated in a military hospital in Barbados. When he became chief of the Hospital of the Ordnance (artillery) at Woolwich, he published a hospital report based on the same principles (Rollo 1801). Richard McCausland (1783) published his comparative studies of various treatments for intermittent fevers in statistical tables (Maehle 2011). Thomas Dickson Reide’s (1793) ‘View of the diseases of the Army’ was full of tabular compilations and arithmetical calculations, as were James McGrigor’s reports (1801; 1815). Numerical data were quite often used to calculate simple ratios. For example, William Falconer (1807) calculated success:failure ratios to compare the results of his practice in Bath with those published earlier by Rice Charleton (1770).
Replacing certainties with probabilities
What were the motives for quantifying and tabulating observations? What were the numbers intended to convey? A book by George Fordyce published in 1793 provides an initial answer: its title was ‘An attempt to improve the evidence of medicine’ (Fordyce 1793), published in the Transactions of a Society for the Improvement of Medical and Chirurgical Knowledge. Quantification of experience was aimed at “increasing the certainty of medicine.” John Millar (1798) observed that “Where mathematical reasoning can be had, it is a great folly to make use of any other, as to grope for a thing in the dark, when you have a candle standing by you”; and his dispensary-physician colleague William Black (1789) noted that “However it may be slighted as an heretical innovation, I would strenuously recommend Medical Arithmetick as a guide and compass through the labyrinth of therapeutick.” Reide (1793) justified this approach using a simple analogy: “How ridiculous would it appear [for a merchant] to judge of the advantages or disadvantages of particular branches of commerce from reasoning and conjecture whilst the result can be reduced to certainty by keeping regular accounts, and balancing them at stated periods.”
Methodological questions were indeed eagerly debated in 18th century British medicine. Among the issues was that of certainty versus the slowly growing notion of statistical probability. In 1772, James Lind, then chief of the 1000-bed Haslar Naval Hospital, summarised the transition from belief in an absolute authority to reliance on relative statistics: “A work indeed more perfect, and remedies more absolutely certain might perhaps have been expected from an inspection of several thousand…patients.” But even such facts remained partial in his view, and he concluded with the remarkable insight that “for though they may for a little, flatter with hopes of greater success, yet more enlarged experience must ever evince the fallacy of all positive assertions in the healing art” (Lind 1772, p v-vi).
More outspokenly, John Haygarth, with the help “of an ingenious friend, Mr Dawson, a truly mathematical genius”, calculated probabilities of escaping infection with ‘continuous fever’ or smallpox. On the basis of results “computed arithmetically by the doctrine of chances, according to the data”, Haygarth indicated that immediate isolation of patients with smallpox and fever in specific wards in Chester was required (Haygarth 1784, p 26-28).
With the availability of more and more numerical data, numbers began to be pitted against numbers at the beginning of the 19th century. How did people judge whether treatment comparisons were trustworthy and meaningful? For example, during the debates about bloodletting for the treatment of fevers around 1800, statistics were widely used on both sides. It became clear that these data needed interpretation. In 1813, Thomas Mills re-introduced copious bloodletting and purging at the Dublin Fever Hospital. The statistics comparing his mortality rates with those of other physicians who had hardly used bloodletting were reprinted in the review of his ‘Essay on the utility of blood-letting in fever’ (Mills 1813). This elicited the following comment:
presuming…these are candid and correct statements, we may deem them potent arguments in favour of the advantages of the anti-phlogistic [bloodletting and purging] treatment of fever (Edinburgh Medical and Surgical Journal 1813).
Besides the issue of honesty, the question of bias was raised – the need to compare like with like. For instance, the Monthly Review wrote that Mills’ work left “a rather painful impression on our minds,” for these impressive results might be explained by the type of patients treated by Mills rather than by the therapy he had applied (Monthly Review 1814, p 314). This issue was also raised in relation to interpreting statistics about the timing of amputation (immediate vs. delayed), and in comparisons of treatments for fever in the Army and Navy (Edinburgh Medical and Surgical Journal 1813, p 458-459).
A writer in the Edinburgh Medical and Surgical Journal in 1813 stressed that, if one could assume the data to have been honestly assembled and presented by both sides, the only way out of the maze would be through “extensive comparative experiments” (Edinburgh Medical and Surgical Journal 1813).
During the 19th century there was gradual recognition that it is important to record the extent of uncertainty associated with estimates of treatment differences. In particular, Jules Gavarret, a mathematically inclined Parisian physician, pointed out the need to analyse treatment comparisons of sufficient size and to calculate the ‘limits of oscillation’ (variation) associated with statistical estimates of treatment differences (Gavarret 1840). However, this practice did not become widely adopted until the second half of the 20th century (see Explanatory Essay 3.2).
The text in these essays may be copied and used for non-commercial purposes on condition that explicit acknowledgement is made to The James Lind Library (www.jameslindlibrary.org).