Bracken MB (2008). Why animal studies are often poor predictors of human reactions to exposure.

© Michael B. Bracken, Center for Perinatal, Pediatric and Environmental Epidemiology, Yale University Schools of Medicine and Public Health, 1 Church Street, New Haven, CT 06510, USA. E-mail:

Cite as: Bracken MB (2008). Why animal studies are often poor predictors of human reactions to exposure. JLL Bulletin: Commentaries on the history of treatment evaluation (

The concept that animal research, particularly that relating to pharmaceuticals and environmental agents, may be a poor predictor of human experience is not new. A thousand years ago, Ibn Sina commented on the need to study humans rather than animals (Ibn Sina 11th Century; Nasser et al. 2007), and Alexander Pope’s dictum “The proper study of mankind is man” is well known and has been widely cited (see Gold 1952). Pharmacologists, in particular, have long recognized the difficulties inherent in extrapolating drug data from animals to man (Brodie 1962; Lasagna 1964). Given the large number of animal studies conducted, it would be expected that some animal experiments do predict human reactions. For example, penicillin was observed to protect both mice and humans from staphylococcal infections (Florey and Abraham 1951), and isotretinoin (‘Accutane’) causes birth defects in rabbits and monkeys as well as in humans (although not in mice or rats) (Nau 2001). By contrast, corticosteroids are widely teratogenic in animals but not in humans (Needs and Brooks 1985), and thalidomide is not a teratogen in many animal species but it is in humans (Lepper et al. 2006). A recent phase 1 study of the monoclonal antibody TGN1412 resulted in life-threatening morbidity in all six healthy volunteers, reflecting inadequate prediction of the human response even in non-human primates (Stebbings et al. 2007).

One reason why animal experiments often do not translate into replications in human trials (Roberts et al. 2002; Hackam and Redelmeier 2006) or into cancer chemoprevention (Corpet and Pierre 2003; 2005; American Council on Science and Health 2006) is that many animal experiments are poorly designed, conducted and analyzed. Another possible contribution to the failure to replicate the results of animal research in humans is that reviews and summaries of evidence from animal research are methodologically inadequate (Mignini and Khan 2006). In one survey, only 1/10,000 Medline records of animal studies were tagged as being meta-analyses, compared with 1/1000 records of human studies (Sandercock and Roberts 2002). In recent reports, the poor quality of research synthesis was documented by a comprehensive search of Medline, which found only 25 systematic reviews of animal research (Pound et al. 2004). Other recent studies similarly found only 30 (Mignini and Khan 2006) and 57 (Peters et al. 2006) systematic reviews of any type of animal research. These deficiencies are important because animal research often provides the rationale for hypotheses studied by epidemiologists and clinical researchers.

The paper by Perel and his colleagues (2007) has been added to the James Lind Library because it has made an important methodological contribution to understanding why animal studies may not predict human reactions. The authors conducted systematic reviews of the animal research relevant to studies in humans in six areas of research where confident estimates of intervention effects (benefit or harm) have been demonstrated in systematic reviews of randomized trials. The interventions studied were: corticosteroids for head injury; antifibrinolytics to reduce bleeding; tissue plasminogen activator to reduce death and disability after stroke; tirilazad for ischaemic stroke; antenatal corticosteroids to reduce lung morbidity and death in preterm newborns; and bisphosphonates to increase bone mineral density. In three of the research areas the animal studies and human trials were substantially discordant; in the other three the results were essentially similar. In all areas of research, however, major methodological limitations of the animal research and evidence of widespread publication bias were identified.

Systematic review of animal studies is most advanced in the field of stroke research (Horn et al. 2001; Sena et al. 2007), an area where almost no new human drug therapies have been developed despite decades of research. In one systematic review of FK506, for which 29 animal studies were found, only one study had blinded investigators to the intervention and only two blinded observers during outcome assessment. None of the 29 studies met all 10 quality criteria applied by the reviewers (one study met no criteria; the highest score was 7). Meta-analysis of the studies demonstrated a strong trend for the methodologically weakest studies to show the strongest protective effects, and the methodologically strongest studies to show either no or weak protective effects (Macleod et al. 2005).
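The comparison described above, in which weaker studies report larger effects, can be illustrated with a minimal sketch that pools effect sizes separately for lower-quality and higher-quality studies. It uses fixed-effect inverse-variance weighting, which is one common pooling method and not necessarily the one used by the reviewers. The function names, the quality threshold and every number below are hypothetical; none of them are data from the FK506 review.

```python
import math


def pooled_effect(studies):
    """Fixed-effect inverse-variance pooling.

    `studies` is a list of (effect, standard_error) pairs.
    Returns the pooled effect and its standard error.
    """
    weights = [1.0 / (se ** 2) for _, se in studies]
    total = sum(weights)
    estimate = sum(w * eff for w, (eff, _) in zip(weights, studies)) / total
    return estimate, math.sqrt(1.0 / total)


def pool_by_quality(rows, threshold):
    """Split (quality_score, effect, standard_error) rows at `threshold`
    and pool each stratum separately."""
    low = [(e, s) for q, e, s in rows if q < threshold]
    high = [(e, s) for q, e, s in rows if q >= threshold]
    return pooled_effect(low), pooled_effect(high)


# HYPOTHETICAL illustration only: (quality score out of 10, effect, SE).
rows = [
    (1, 0.60, 0.20),
    (2, 0.50, 0.20),
    (3, 0.45, 0.25),
    (5, 0.20, 0.15),
    (6, 0.10, 0.15),
    (7, 0.05, 0.20),
]

(low_est, low_se), (high_est, high_se) = pool_by_quality(rows, threshold=4)
print(f"lower-quality studies:  {low_est:.2f} (SE {low_se:.2f})")
print(f"higher-quality studies: {high_est:.2f} (SE {high_se:.2f})")
```

With these made-up inputs the lower-quality stratum yields the larger pooled effect, which is the pattern the review reported; a real analysis would also need to consider heterogeneity between studies and a random-effects model.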

The few systematic reviews of the animal literature that have been done also point to the poor quality of other animal research, and to the difficulty of extrapolating from it to humans (Bebarta et al. 2003), a concern which is increasingly being raised in other fields of drug development and evaluation (Kenter and Cohen 2006; Sundstrom 2007). Matthews (2008) has recently challenged those who have claimed that “virtually every medical achievement of the last century has depended directly or indirectly on research with animals” to provide evidence justifying their assertion.

Some of the key problems have been summarized by Pound and her colleagues (2004):

  • Disparate animal species and strains, with a variety of metabolic pathways and drug metabolites, leading to variation in efficacy and toxicity
  • Different models for inducing illness or injury, with varying similarity to the human condition
  • Variations in drug dosing schedules and regimens of uncertain relevance to the human condition
  • Variability in animals for study, methods of randomization, choice of comparison therapy (none, placebo, vehicle)
  • Small experimental groups with inadequate statistical power; simple statistical analyses that do not account for confounding; and failure to follow intention-to-treat principles
  • Nuances in laboratory technique that may influence results (for example, methods for blinding investigators) being neither recognized nor reported
  • Selection of outcome measures which, being surrogates or precursors of disease, are of uncertain relevance to the human clinical condition
  • Variable duration of follow up, which may not correspond to disease latency in humans
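The point about small experimental groups and inadequate statistical power can be made concrete with a rough calculation. The sketch below approximates the power of a two-sided comparison of two group means using the normal distribution; it is an approximation (an exact calculation would use the t distribution), and the function name, the standardized effect size of 0.5 and the group sizes are assumptions chosen for illustration, not figures from any of the studies cited here.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    `effect_size` is the standardized difference between group means
    (difference divided by the common standard deviation).
    Uses a normal approximation rather than the exact t-test.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of rejecting in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# HYPOTHETICAL group sizes and effect size, for illustration only.
for n in (10, 64):
    print(n, round(approx_power(0.5, n), 2))
```

Under these assumptions, an experiment with 10 animals per group has roughly a 20% chance of detecting a moderate effect, whereas about 64 per group are needed to reach roughly 80%.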

As Perel and his colleagues (2007) and others referred to in this commentary have shown, animal studies will only become more valid predictors of human reactions to exposures and treatments if there is substantial improvement both in their scientific methods and in the systematic review of the animal literature as it evolves. Systematic reviews of animal research, if they are used to inform the design of clinical trials, particularly with respect to appropriate drug dose, timing and other crucial aspects of the drug regimen, will further improve how well animal research predicts the results of human clinical trials.


I am grateful to Malcolm Macleod for comments on an earlier draft of this commentary.

This James Lind Library commentary has been republished in the Journal of the Royal Society of Medicine 2009;102:120-122.


American Council on Science and Health (2006). America’s war on carcinogens: Reassessing the use of animal tests to predict human cancer risk.

Bebarta V, Luyten D, Heard K (2003). Emergency medicine animal research: does use of randomization and blinding affect the results? Academic Emergency Medicine 10:684-7.

Brodie BB (1962). Symposium on clinical drug evaluation and human pharmacology. VI. Difficulties in extrapolating data on metabolism of drugs from animal to man. Clinical Pharmacology and Therapeutics 3:374-80.

Corpet DE, Pierre F (2003). Point: From animal models to prevention of colon cancer. Systematic review of chemoprevention in min mice and choice of the model system. Cancer Epidemiology Biomarkers and Prevention 12:391-400.

Corpet DE, Pierre F (2005). How good are rodent models of carcinogenesis in predicting efficacy in humans? A systematic review and meta-analysis of colon chemoprevention in rats, mice and men. European Journal of Cancer 41:1911-22.

Florey HW, Abraham EP (1951). The work on penicillin at Oxford. Journal of the History of Medicine and Allied Sciences 6:302-17.

Gold H (1952). The proper study of mankind is the man. American Journal of Medicine 12:619-620.

Hackam DG, Redelmeier DA (2006). Translation of research evidence from animals to humans. JAMA 296:1731-2.

Horn J, de Haan RJ, Vermeulen M, Luiten PGM, Limburg M (2001). Nimodipine in animal model experiments of focal cerebral ischaemia: a systematic review. Stroke 32:2433-38.

Ibn Sina (11th century). Kitab Al-Qanun fi al-Tibb (

Nasser M, Tibi A, Savage-Smith E (2007). Ibn Sina’s Canon of Medicine: 11th century rules for assessing the effects of drugs. JLL Bulletin: Commentaries on the history of treatment evaluation (

Kenter MJ, Cohen AF (2006). Establishing risk of human experimentation with drugs: lessons from TGN1412. Lancet 368:1387-91.

Lasagna L (1964). The diseases drugs cause. Perspectives in Biology and Medicine 7:457-70.

Lepper ER, Smith NF, Cox MC, Scripture CD, Figg WD (2006). Thalidomide metabolism and hydrolysis: mechanisms and implications. Current Drug Metabolism 7:677-85.

Macleod MR, O’Collins T, Horky LL, Howells DW, Donnan GA (2005). Systematic review and meta-analysis of the efficacy of FK506 in experimental stroke. Journal of Cerebral Blood Flow and Metabolism 25:713-21.

Matthews RAJ (2008). Medical progress depends on animal models – doesn’t it? Journal of the Royal Society of Medicine 101:95-98. DOI 10.1258/jrsm.2007.070164.

Mignini LE, Khan KS (2006). Methodological quality of systematic reviews of animal studies: a survey of reviews of basic research. BMC Medical Research Methodology 6:10.

Nau H (2001). Teratogenicity of isotretinoin revisited: species variation and the role of all-trans-retinoic acid. Journal of the American Academy of Dermatology 45:S183-7.

Needs CJ, Brooks PM (1985). Antirheumatic medication in pregnancy. British Journal of Rheumatology 24:282-90.

Perel P, Roberts I, Sena E, Wheble P, Briscoe C, Sandercock P, Macleod M, Mignini LE, Jayaram P, Khan KS (2007). Comparison of treatment effects between animal experiments and clinical trials: systematic review. BMJ 334:197.

Peters JL, Sutton AJ, Jones DR, Abrams KR, Rushton L (2006). A systematic review of systematic reviews and meta-analyses of animal experiments with guidelines for reporting. Journal of Environmental Science and Health B 41:1245-58.

Pound P, Ebrahim S, Sandercock P, Bracken MB, Roberts I (2004). Where is the evidence that animal research benefits humans? BMJ 328:514-7.

Roberts I, Kwan I, Evans P, Haig S (2002). Does animal experimentation inform human healthcare? Observations from a systematic review of international animal experiments on fluid resuscitation. BMJ 324:474-6.

Sandercock P, Roberts I (2002). Systematic reviews of animal experiments. Lancet 360:586.

Sena E, van der Worp HB, Howells D, Macleod M (2007). How can we improve the pre-clinical development of drugs for stroke? Trends in Neurosciences 30:433-9.

Stebbings R, Findlay L, Edwards C, Eastwood D, Bird C, North D, Mistry Y, Dilger P, Liefooghe E, Cludts I, Fox B, Tarrant G, Robinson J, Meager T, Dolman C, Thorpe SJ, Bristow A, Wadhwa M, Thorpe R, Poole S (2007). “Cytokine Storm” in the Phase I Trial of Monoclonal Antibody TGN1412: Better understanding the causes to improve preclinical testing of immunotherapeutics. Journal of Immunology 179:3325-31.

Sundstrom L (2007). Thinking inside the box. To cope with an increasing disease burden, drug discovery needs biologically relevant and predictive testing systems. EMBO Reports:S40-43.