Untrustworthy treatment comparisons are those in which biases, or the play of chance, or both result in misleading estimates of the effects of treatments. Fair treatment comparisons avoid biases and reduce the effects of the play of chance.
Failure to test theories about treatments in practice is not the only preventable cause of treatment tragedies. Tragedies have also occurred because the tests used to assess the effects of treatments have been unreliable and misleading. The principles of fair tests have been evolving for at least a millennium (see the James Lind Library records coded 'Principles of Testing') – and they continue to evolve today (Savović et al. 2012; Jefferson et al. 2014).
For example, in the 1950s, theory and poorly controlled tests yielding unreliable evidence suggested that giving a synthetic sex hormone, diethylstilboestrol (DES), to pregnant women who had previously had miscarriages and stillbirths would increase the likelihood of a successful outcome of later pregnancies. Although fair tests had suggested that DES was useless, theory and the unreliable evidence, together with aggressive marketing, led to DES being prescribed to millions of pregnant women over the next few decades. The consequences were disastrous: some of the daughters of women who had been prescribed DES developed cancers of the vagina, and other children had other health problems, including malformations of their reproductive organs and infertility (Apfel and Fisher 1984).
Problems resulting from inadequate tests of treatments continue to occur. Again, as a result of unreliable evidence and aggressive marketing, millions of women were persuaded to use hormone replacement therapy (HRT), not only because it could reduce unpleasant menopausal symptoms, but also because it was claimed that it would reduce their chances of having heart attacks and strokes. When these claims were assessed in fair tests, the results showed that, far from reducing the risks of heart attacks and strokes, HRT increases the risks of these life-threatening conditions, as well as having other undesirable effects (McPherson 2004).
These examples of the need for fair tests of treatments are a few of many that illustrate how treatments can do more harm than good. Improved general knowledge about fair tests of treatments is needed so that – laced with a healthy dose of scepticism – we can all assess claims about the effects of treatments more critically. That way, we will all become more able to judge which treatments are likely to do more good than harm.
Fair tests entail taking steps to reduce the likelihood that we will be misled by the effects of biases of various sorts. Those addressed in the James Lind Library include design bias, allocation bias, co-intervention bias, observer bias, analysis bias, biases in assessing unanticipated effects, reporting bias, biases in systematic reviews, and researcher biases and fraud.
Essays on taking account of the play of chance address recording and interpreting numbers, quantifying uncertainty, and reducing the play of chance using meta-analysis.
The text in these essays may be copied and used for non-commercial purposes on condition that explicit acknowledgement is made to The James Lind Library (www.jameslindlibrary.org).
Apfel RJ, Fisher SM (1984). To do no harm: DES and the dilemmas of modern medicine. New Haven, CT: Yale University Press.
Jefferson T, Jones MA, Doshi P, Del Mar CB, Hama R, Thompson MJ, Spencer EA, Onakpoya I, Mahtani KR, Nunan D, Howick J, Heneghan CJ (2014). Neuraminidase inhibitors for preventing and treating influenza in healthy adults and children. Cochrane Database of Systematic Reviews 2014, Issue 4. Art. No.: CD008965. DOI:10.1002/14651858.CD008965.pub4.
McPherson K (2004). Where are we now with hormone replacement therapy? BMJ 328:357-358.
Savović J, Jones HE, Altman DG, Harris RJ, Jüni P, Pildal J, Als-Nielsen B, Balk EM, Gluud C, Gluud LL, Ioannidis JPA, Schulz KF, Beynon R, Welton NJ, Wood L, Moher D, Deeks JJ, Sterne JAC (2012). Influence of reported study design characteristics on intervention effect estimates from randomized controlled trials. Annals of Internal Medicine 157:429-438.