
Vol. 55 No. 4

Trial Magazine

Theme Article


Decoding General Causation Data

General causation is often a hard-fought aspect of pharmaceutical cases, and understanding how to interpret data on injuries in exposed populations is crucial to proving it.

Tara Tabatabaie, April 2019

In every pharmaceutical case, the plaintiff has the burden of proving by a preponderance of evidence that the drug in question can cause the claimed injury (general causation) and that the drug caused this particular plaintiff’s injury (specific causation). Specific causation, while crucial, is normally a minor part of the causation fight. Proving general causation, however, requires developing expert opinions in numerous scientific and medical areas, and defendants’ summary judgment motions often incorporate Daubert challenges to plaintiffs’ experts’ opinions.1


Proving general causation typically relies on three kinds of evidence: human data, animal studies, and mechanistic data.2 Named after Sir Austin Bradford Hill, the British epidemiologist who played a pivotal role in linking smoking to lung cancer, “Hill” factors often are invoked when evaluating causation evidence.3 These factors include strength of association, consistency, specificity, temporality, biological gradient, plausibility, coherence, experiment, and analogy.4

Defendants cite Hill factors to challenge plaintiffs’ causation theories.5 They almost invariably try to convince the judge that the Hill factors are rigid criteria and that every factor must be met for an association to rise to the level of causation. Neither of these claims is accurate.

Even though many in the epidemiology field have carelessly referred to the Hill factors as “criteria,”6 Hill never used that term and did not intend the factors to be used as a checklist for determining causation.7 In his 1965 address to the Royal Society of Medicine, where he first articulated the factors, Hill simply called them nine “different viewpoints” that, when applicable, would assist in determining causation.3

Hill did not believe that all nine factors were necessary for establishing causation.8 The factors “are not absolute nor does inference of a causal relationship require that all criteria be met. In fact, only temporality [or whether the exposure preceded the injury] is requisite.”9 Defense attempts to question causation opinions that do not satisfy all nine factors contradict Hill’s admonition.10 Courts that insist on such checklists tend to blur the line between admissibility (a legal issue for the court) and whether the expert testimony was sufficient to prove the plaintiff’s case (a factual issue for the fact-finder).11 “Rather than exclude studies that fail to meet these court-imposed thresholds, courts should admit and allow the adversary process to inform the jury about the strengths and weaknesses of these studies.”12

Challenges to Epidemiological Data

Epidemiological studies can support a causal association between exposure and injury by revealing an increased incidence of the injury in an exposed population when compared to that of an unexposed group. Is epidemiological data necessary for proving causation? The simple answer is no. “Hill focused on interpreting epidemiological evidence but never claimed that it must be present to warrant a high degree of willingness to certify causality.”13

Supportive epidemiological data goes a long way in enhancing plaintiffs’ general causation arguments, but its absence should not be seen as a death knell. The Reference Guide on Epidemiology recognizes that “most courts have appropriately declined to impose a threshold requirement that a plaintiff always must prove causation with epidemiologic evidence.”14 Several courts have found robust data from animal and mechanistic studies sufficient to overcome Daubert challenges that are based on the lack of epidemiological data.15

And be aware that when plaintiffs submit epidemiological data in support of causation, defendants may use a few Hill-related angles of attack to exclude it, such as the lack of statistical significance, an alleged necessity for doubling of risk, and the lack of a linear dose-response effect.

Statistical significance based on p-value. Defendants may argue that only studies with a statistical significance (p-value) of less than 0.05 should be considered in a causation analysis. While some courts have rejected this argument, it has unfortunately been accepted by others.16 The p-value is the probability of finding an association at least as strong as the one observed if no true association existed: the higher the p-value, the greater the chance that the observed association is a false positive. A lower p-value supports the conclusion that an observed association is not likely to be due solely to chance. Some courts have incorrectly held that studies with p-values greater than 0.05 should be excluded as unreliable.17 Epidemiologists have widely criticized this approach.18 Hill also stressed, “No formal tests of significance can answer those [causation] questions.”19
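To make the concept concrete, here is a minimal Python sketch, assuming the scipy library is available; the injury counts are hypothetical, invented only to show how a p-value is computed for a simple exposed-versus-unexposed comparison:

```python
# Hypothetical 2x2 comparison of injury counts in an exposed and an
# unexposed group; all numbers are invented for illustration.
from scipy.stats import fisher_exact

injured_exposed, total_exposed = 12, 1000
injured_unexposed, total_unexposed = 5, 1000

table = [
    [injured_exposed, total_exposed - injured_exposed],
    [injured_unexposed, total_unexposed - injured_unexposed],
]
odds_ratio, p_value = fisher_exact(table)  # two-sided Fisher's exact test
print(f"odds ratio = {odds_ratio:.2f}, p-value = {p_value:.3f}")
# Here the exposed rate is more than double the unexposed rate, yet the
# p-value exceeds 0.05 because so few injuries occurred; "not significant"
# is not the same as "no effect."
```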

Established causal relationships exist for which the epidemiological data have p-values greater than 0.05. For example, ionizing radiation (such as the radiation released by an atomic bomb) is a recognized cause of cancer even though much of the supporting epidemiological evidence does not reach statistical significance at the 0.05 level. And in its study of the effects of secondhand cigarette smoke, the EPA found the totality of the data supporting a causal relationship convincing, even though the epidemiological studies it relied on had p-values exceeding 0.05.20 While 0.05 is the significance level conventionally selected, other levels can be and have been used.21


When reviewing p-values, some courts fail to recognize that statistical significance is not a one-size-fits-all concept. “The level of statistical significance chosen is a tradeoff between false positives and false negatives”—the lower the chosen level of statistical significance, the higher the probability of obtaining a false negative result.22 

What routinely gets lost in the discussion is that statistical significance depends on the study power. “A study may fail to find a statistically significant association not because one doesn’t exist, but because the study was insufficiently powered to find an association of that magnitude.”23 The study power depends both on the size of the study and on the size of the studied effect.

Larger studies may reliably detect a cause-and-effect relationship even with p-values greater than 0.05, while smaller studies can easily fail to detect a true cause-and-effect relationship. For instance, a clinical trial comparing the efficacy of a new treatment and a conventional one with 300 subjects in each arm may fail to detect the risk of a serious but rarely occurring side effect at the required statistical significance simply because the number of subjects in the study and the incidence of the injury are insufficient to allow detection.
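A rough power calculation makes the point. This sketch uses the standard normal approximation for comparing two proportions; the side effect rates (1 percent in the control arm, 3 percent in the treatment arm) are hypothetical:

```python
# Approximate power of a two-arm trial to detect a rare adverse event,
# using the normal approximation for comparing two proportions.
# All rates are hypothetical.
from math import sqrt
from scipy.stats import norm

def power_two_proportions(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions."""
    p_bar = (p1 + p2) / 2
    z_alpha = norm.ppf(1 - alpha / 2)
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    se_alt = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    z = (abs(p1 - p2) - z_alpha * se_null) / se_alt
    return norm.cdf(z)

# Side effect occurring in 1% of controls vs. 3% of treated patients:
print(power_two_proportions(0.03, 0.01, n_per_arm=300))   # roughly 0.4
print(power_two_proportions(0.03, 0.01, n_per_arm=2000))  # roughly 0.99
```

With 300 subjects per arm, the trial has well under a coin flip’s chance of detecting the excess risk at the 0.05 level; the same comparison with 2,000 per arm would detect it almost every time.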

Statistical significance based on confidence intervals. The same considerations that apply to interpreting a p-value also counsel against a rigid approach to statistical significance presented as a confidence interval. A confidence interval is a range of values, calculated from the results of a study, that likely contains the true value of what the study aims to measure, such as relative risk. The probability that the confidence interval encompasses the true value is called the confidence level. The most commonly used confidence level is 95 percent.

The width of the confidence interval is a parameter that must be considered. “For a given risk estimate, a narrower confidence interval reflects a decreased likelihood that the association found in the study would occur by chance.”24 An automatic rejection of any confidence interval that includes 1.0 (which signifies no increased risk of injury) could lead to interpreting a study as negative or inconclusive when a true association exists.25
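For illustration, a 95 percent confidence interval for a relative risk is conventionally computed on the log scale. In this sketch the counts are hypothetical:

```python
# Hypothetical illustration: 95% confidence interval for a relative risk,
# computed on the log scale (the standard epidemiological approach).
from math import exp, log, sqrt

injured_exposed, total_exposed = 30, 1000
injured_unexposed, total_unexposed = 20, 1000

rr = (injured_exposed / total_exposed) / (injured_unexposed / total_unexposed)
se_log_rr = sqrt(
    1 / injured_exposed - 1 / total_exposed
    + 1 / injured_unexposed - 1 / total_unexposed
)
low, high = exp(log(rr) - 1.96 * se_log_rr), exp(log(rr) + 1.96 * se_log_rr)
print(f"RR = {rr:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
# Here RR = 1.50 but the interval runs from about 0.86 to about 2.62: the
# study is consistent both with no effect and with a more than doubled
# risk; it is inconclusive, not negative.
```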

Use these points to go on the offensive when a drug manufacturer hides behind clinical trial results that do not show any increase in the rate of the injury. Those studies are designed mainly to prove the efficacy of the drug. Almost invariably, they do not have sufficient power to detect the injury because they are either too small or the injury occurs at a low rate.26

Scrutinizing the design and conduct of clinical trials can help provide crucial evidence for your causation case. For example, in litigation involving Avandia (a diabetes drug found to cause heart attacks in patients), drilling down on GlaxoSmithKline’s (GSK) clinical trial methodology showed that the use of lipid-lowering drugs by patients in the Avandia arm of most of those trials was much higher than in the control arms, where patients were treated with other diabetes drugs. GSK documents revealed that Avandia increased the risk of heart attacks by increasing the level of a particular lipid molecule in the bloodstream.

By zeroing in on the clinical trial details, we discovered that patients treated with Avandia were given lipid-lowering drugs at a much higher rate to obscure Avandia’s effect on the cardiovascular system. This finding enabled our experts to counter the defendant’s claim that its clinical trial results supported the conclusion that Avandia did not cause an increased risk of cardiovascular disease.

Unfortunately, few if any of these details end up in the articles publishing the results of such trials or even in the main body of reports submitted to the FDA. A careful review of the raw data behind the reports can uncover hidden gems that disprove defendants’ less-than-truthful claims.

Is a doubling of relative risk required? Relative risk is a measure of the risk in the exposed population attributable to the agent under investigation. It is defined as the ratio of the incidence rate of a disease in exposed individuals to the incidence rate in unexposed individuals. For example, if 3 percent of exposed patients and 1.5 percent of unexposed patients develop the injury, the relative risk is 0.03/0.015, or 2.0.

As with statistical significance, a study’s ability to detect an elevated relative risk depends on the size of the population studied.27 For a study to detect a relative risk of 2.0 (a doubling of the rate of injury in the exposed population) with a statistical significance of less than 0.05 at the required power, it must have a minimum sample size, which the investigator determines using statistical formulas derived for that purpose. A study with a sample size that is too small cannot do so.
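The flavor of those formulas can be sketched in a few lines. This version uses the common normal-approximation sample size formula for comparing two proportions; the baseline injury rates are hypothetical:

```python
# Approximate per-arm sample size needed to detect a given relative risk
# with two-sided alpha = 0.05 and 80% power (normal approximation for
# two proportions). Baseline rates are hypothetical.
from math import ceil, sqrt
from scipy.stats import norm

def n_per_arm(p_unexposed, rr, alpha=0.05, power=0.80):
    p_exposed = p_unexposed * rr
    p_bar = (p_exposed + p_unexposed) / 2
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p_exposed * (1 - p_exposed)
                        + p_unexposed * (1 - p_unexposed))) ** 2
    return ceil(num / (p_exposed - p_unexposed) ** 2)

print(n_per_arm(0.05, 2.0))    # about 435 per arm at a 5% baseline rate
print(n_per_arm(0.001, 2.0))   # roughly 23,500 per arm at a 0.1% baseline
```

The rarer the injury, the larger the study must be to detect even a doubled risk, which is why efficacy-sized trials so often miss rare side effects.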

But small studies still provide valuable information about the risk of the injury in the exposed group and should be considered in the context of all other available evidence. For instance, in a drug case, pooling the results from multiple small clinical trials (a meta-analysis) increases the statistical power of the analysis and can help detect the risk of an injury by reducing the likelihood of the false negative error associated with the smaller studies.
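A minimal sketch of fixed-effect, inverse-variance pooling shows the mechanics. The three “trials” below are invented, and each is too small to be statistically significant on its own:

```python
# Minimal fixed-effect meta-analysis: pool log relative risks from several
# small (hypothetical) trials using inverse-variance weights.
from math import exp, log

# (relative risk, standard error of log RR) for three invented small trials
studies = [(1.6, 0.40), (1.5, 0.45), (1.8, 0.35)]

weights = [1 / se ** 2 for _, se in studies]
pooled_log_rr = sum(w * log(rr) for (rr, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5

pooled_rr = exp(pooled_log_rr)
low = exp(pooled_log_rr - 1.96 * pooled_se)
high = exp(pooled_log_rr + 1.96 * pooled_se)
print(f"pooled RR = {pooled_rr:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
# Each trial's own 95% CI spans 1.0, but the pooled interval here is
# roughly (1.06, 2.58), excluding 1.0: the individual "negative" results
# were a power problem, not evidence of no effect.
```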

Defendants have convinced some courts that a doubling of the relative risk—that the risk of the exposed population must be twice that of the unexposed population—is necessary for meeting the “more probable than not” evidentiary standard.28 Equating legal and scientific standards, however, has been criticized as logically unsound and analogous to mixing “apples and oranges.”29 “As long as there is a relative risk greater than 1.0, there is some association, and experts should be permitted to base their causal explanations on such studies.”30
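The arithmetic behind the doubling argument is worth seeing. It rests on the attributable fraction, the share of injuries among exposed people that the exposure accounts for, which exceeds 50 percent only when the relative risk exceeds 2.0; as the Greenland article cited in note 27 explains, equating that population-level fraction with an individual plaintiff’s probability of causation is itself contested. A short sketch:

```python
# Attributable fraction: the share of injuries among the exposed that is
# attributable to the exposure, given a relative risk (RR).
def attributable_fraction(rr):
    return (rr - 1) / rr

for rr in (1.3, 2.0, 3.0):
    print(f"RR = {rr:.1f} -> attributable fraction = {attributable_fraction(rr):.0%}")
# RR = 1.3 -> 23%, RR = 2.0 -> 50%, RR = 3.0 -> 67%. The defense argument
# treats "over 50%" as "more probable than not," a leap that Greenland
# (note 27) and others have criticized.
```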

Causal relationships with relative risks of less than 2.0 are well established. For instance, secondhand smoke is an accepted cause of lung cancer even though studies have shown a relative risk of only 1.20–1.30.31 The same is true for the recognized relationship between smoking and cardiovascular disease and for the risk of breast cancer due to treatment with estrogen-progestogen.32

This concept also has been recognized by the Restatement (Third) of Torts, which notes that a judicial requirement that plaintiffs show a threshold increase in risk (or a doubling of risk) to satisfy the burden of proof of causation is “usually inappropriate.”33

The dose-response question. Defendants also attack the lack of a linear dose-response effect (when the risk of injury increases with dosage) in epidemiological data. While the presence of a linear dose-response effect strengthens a causal analysis, the lack of a linear effect should not be taken as evidence of a lack of causation.34

Some established causal relationships do not show apparent dose-response relationships.35 For example, mortality due to alcohol abuse has a J-shaped dose-response curve: “Death rates are higher among nondrinkers than among moderate drinkers, but ascend to the highest levels for heavy drinkers.”36 “In fact, most dose-response curves are non-linear and can even vary in shape from one study to the next depending on unique characteristics of the given population, exposure routes, and molecular endpoints assessed.”37 The Hill idea of a linear dose-response curve is based mostly on an older, less sophisticated understanding of injury mechanisms and may not apply to every case.
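To see what a non-monotonic curve looks like in numbers, here is a toy J-shaped dose-response table; the values are invented solely to illustrate the shape described above:

```python
# Toy J-shaped dose-response curve: relative mortality risk by daily
# alcohol consumption. All values are invented to illustrate the shape.
doses = [0, 1, 2, 4, 8]                  # drinks per day
risks = [1.20, 0.95, 1.00, 1.40, 2.50]   # relative risk vs. moderate drinkers
for dose, risk in zip(doses, risks):
    print(f"{dose} drinks/day: RR = {risk:.2f}")
# Risk dips before it climbs: a genuine causal relationship can exist
# without any linear dose-response trend.
```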

The Role of Toxicology

Toxicological studies, including animal studies and mechanistic data, are the other pillar of the causation analysis. They can be used to support and supplement epidemiological data and are also critical for showing the “biological plausibility” of the injury in question, another one of the Hill factors. “[U]nderstanding the mechanism guides our generalization of a tightly controlled study to a wider population.”38

Animal studies can support causation by showing the injury in exposed animals or helping determine the injury mechanism. In vitro studies performed in isolated cells are not only the primary tool for determining biological mechanisms but can further support causation when cells exposed to the agent in question demonstrate biochemical markers of the injury.

Defendants often challenge the relevance of animal studies and in vitro data to humans. This misleading position raises an obvious question: If the results of animal studies bear no relevance to questions of human health and disease, why do industry, academia, and the government spend millions of dollars each year studying various drugs and agents in animals and in vitro systems? Experiments show “‘that there are more physiologic, biochemical, and metabolic similarities between laboratory animals and humans than there are differences. These similarities increase the probability that results observed in a laboratory setting will predict similar results for humans.’”39

Since Hill named his causation factors in 1965, science has made great strides in determining the cellular and molecular causes of disease. Animal studies are now considered an important tool in understanding potential causal relationships and biological mechanisms in humans.40 That is why, for instance, the International Agency for Research on Cancer (IARC) includes the results of animal studies in all of its carcinogenicity evaluations.41

Scientists now consider data from molecular and cellular experimentation to bolster epidemiological information and lessen the need for repetition of epidemiological studies.42 So, when available, this category of data can serve as more than mere support for Hill’s biological plausibility factor. An up-to-date expert in the area of interest can help make that data an integral part of the causation puzzle.43

In Kennedy v. Collagen Corp., for instance, the plaintiff developed an autoimmune disease from the defendant’s injected collagen products. No epidemiological or animal studies supported the plaintiff’s causation theory. However, the court found that the plaintiff expert’s extensive knowledge of the connection between collagen and various autoimmune disorders, combined with his review of the plaintiff’s medical history and laboratory tests (showing the presence of markers of autoimmunity), was sufficient to render his causation opinion admissible.44

Some plaintiff attorneys tend to discount the need to search for a biological mechanism for the injury. That approach can hurt your case. The defense can use the lack of a plausible biological mechanism to cast doubt on the validity of other causation evidence. Also, a biologically plausible mechanism for the injury “can lend ‘credence to an inference of causality.’”45 Coming up with a plausible mechanism of injury can sometimes make the critical difference for the fact-finder.

Is Bradford Hill the Last Word?

While the defense may try to cast doubt on any causation opinion not based on the Hill factors, this methodology is only one way of reaching a causation opinion. With scientific advances in the past five decades, many in the scientific community believe that, while useful as a starting point, the Hill method “does not fully reflect the current, more clearly articulated [scientific] view of causal processes.”46 Regulatory and scientific organizations such as the EPA, IARC, and the National Toxicology Program use a “weight of evidence” methodology in their causation analyses.47 While defendants argue it is subjective, several courts have correctly approved of causation experts’ use of this methodology.48

Ultimately, some courts take a rigid approach to the reliability of causation experts’ opinions by demanding that certain artificial criteria be met in every case. However, this approach fails to consider that scientists do not reach conclusions this way. Science is, by definition, a fluid field where novelty and innovation are the name of the game. Efforts to educate judges hopefully will enable more plaintiffs to get past the Daubert barrier and present their causation evidence to the ultimate arbiters of credibility: the jurors.


Tara Tabatabaie is an attorney at Fulmer Sill in Oklahoma City and has a Ph.D. in chemistry. She can be reached at ttabatabaie@fulmersill.com.


Notes

  1. See, e.g., In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 176 F. Supp. 3d 483, 500 (E.D. Pa. 2016), aff’d, 858 F.3d 787 (3d Cir. 2017).
  2. Human data refers to human evidence linking an exposure to the injury. These include epidemiological studies, case reports in medical literature, and reports of adverse events to regulatory agencies. Animal studies normally consist of safety studies, such as carcinogenicity studies in rodents, as well as studies performed in animals aimed at elucidating mechanisms of injury. Mechanistic studies aim to determine the biological mechanism of an injury and are carried out mainly in isolated cells. 
  3. Austin Bradford Hill, The Environment and Disease: Association or Causation? 58 Proceedings Royal Soc’y Med. 295 (1965).
  4. Id. at 295–99. For more on Hill’s factors, see R. Jason Richards, Reflecting on Hill’s Original Causation Factors, Trial 44 (Nov. 2016).
  5. See, e.g., David Schwartz, 5 Reasons to Apply the Bradford Hill Criteria in Your Next Case, Innovative Sci. Solutions, LLC (Sept. 20, 2013), www.innovativescience.net/blog/5-reasons-to-apply-the-bradford-hill-criteria-in-your-next-case/.
  6. See, e.g., Carl V. Phillips & Karen J. Goodman, The Missed Lessons of Sir Austin Bradford Hill, 1 Epidemiologic Perspectives & Innovations 3 (2004). 
  7. For a comprehensive discussion, see Raymond Richard Neutra et al., The Use and Misuse of Bradford Hill in U.S. Tort Law, 58 Jurimetrics J. 127 (2018). For a supporting analysis by a court, see Milward v. Acuity Specialty Prods. Grp., Inc., 639 F.3d 11, 17–19 (1st Cir. 2011).
  8. Jeremy Howick et al., The Evolution of Evidence Hierarchies: What Can Bradford Hill’s ‘Guidelines for Causation’ Contribute?, 102 J. Royal Soc’y Med. 186, 187 (2009).  
  9. Thomas A. Glass et al., Causal Inference in Public Health, 34 Ann. Rev. Pub. Health 61, 65 (2013).
  10. Howick et al., supra note 8, at 187.
  11. Steve C. Gold et al., Instructors’ Guide for Scientific Evidence of Factual Causation: An Educational Module 196–97 (2016).
  12. Erica Beecher-Monas, Lost in Translation: Statistical Inference in Court, 46 Ariz. St. L.J. 1057, 1070–71 (2014).
  13. Neutra et al., supra note 7, at 145. 
  14. Michael D. Green et al., Reference Guide on Epidemiology, in Reference Manual on Scientific Evidence 549, 610 n.183 (3d ed. 2011); see also Neutra et al., supra note 7.
  15. See, e.g., Milward, 639 F.3d at 24–25; Benedi v. McNeil-P.P.C., Inc., 66 F.3d 1378, 1384 (4th Cir. 1995); Kennedy v. Collagen Corp., 161 F.3d 1226, 1229–30 (9th Cir. 1998); Rider v. Sandoz Pharm. Corp., 295 F.3d 1194, 1198 (11th Cir. 2002); In re Phenylpropanolamine (PPA) Prods. Liab. Litig., 289 F. Supp. 2d 1230, 1244 (W.D. Wash. 2003); Restatement (Third) of Torts: Liab. for Physical & Emotional Harm §28 reptr. n. cmt. c(3) (2010); see also Glastetter v. Novartis Pharm. Corp., 252 F.3d 986, 992 (8th Cir. 2001) (per curiam).
  16. See, e.g., In re Lipitor (Atorvastatin Calcium) Mktg., Sales Practices & Prods. Liab. Litig., 174 F. Supp. 3d 911, 924–25 (D.S.C. 2016) (requiring a study result to be statistically significant before Hill factors could be applied to it); see also In re Zoloft (Sertraline Hydrochloride) Prods. Liab. Litig., 26 F. Supp. 3d 449 (E.D. Pa. 2014). But see Milward, 639 F.3d at 20–22 (the district court erred in imposing a statistical significance threshold).
  17. See, e.g., In re Lipitor, 174 F. Supp. 3d at 924–25.
  18. See, e.g., Sander Greenland et al., Statistical Test, P Values, Confidence Intervals, and Power: A Guide to Misinterpretations, 31 European J. Epidemiology 337 (2016); see also Phillips & Goodman, supra note 6; Kenneth J. Rothman, A Show of Confidence, 299 N. Engl. J. Med. 1362 (1978).
  19. Hill, supra note 3, at 299.
  20. Steve C. Gold et al., Scientific Evidence of Factual Causation: An Educational Module 39 (2016). 
  21. Neutra et al., supra note 7, at 144–45.
  22. Beecher-Monas, supra note 12, at 1065.
  23. Gold et al., supra note 11, at 43. 
  24. Id. at 42.
  25. See id. at 42–43.
  26. See, e.g., Ambrosini v. Labarraque, 101 F.3d 129, 136 (D.C. Cir. 1996) (the negative studies cited by the plaintiff needed to have “at least an 80 to 90 percent chance of detecting a causal link” and were otherwise inconclusive).
  27. See Sander Greenland, Relation of Probability of Causation to Relative Risk and Doubling Dose: A Methodologic Error That Has Become a Social Problem, 89 Am. J. Pub. Health 1166 (1999).
  28. See, e.g., Merck & Co., Inc. v. Garza, 347 S.W.3d 256, 265–66 (Tex. 2011) (when a plaintiff seeks to use epidemiologic evidence to prove causation, doubling of the risk is a “threshold requirement” for admissible expert testimony).
  29. Beecher-Monas, supra note 12, at 1067.
  30. Id. at 1067–68.
  31. Neutra et al., supra note 7, at 154–55; U.S. Dep’t Health & Human Servs., The Health Consequences of Involuntary Exposure to Tobacco Smoke: A Report of the Surgeon General 435 (2006).
  32. Int’l Agency for Research on Cancer, 91 IARC Monographs on the Evaluation of Carcinogenic Risks to Humans: Combined Estrogen−Progestogen Contraceptives and Combined Estrogen−Progestogen Menopausal Therapy 53–55 (2007); see also Kenneth J. Rothman & Sander Greenland, Causation and Causal Inference in Epidemiology, 95 Am. J. Pub. Health S144 (2005).
  33. Restatement (Third) of Torts, supra note 15, §28(a) cmt. c(4).
  34. See Gold et al., supra note 11, at 87.
  35. Michael Höfler, The Bradford Hill Considerations on Causality: A Counterfactual Perspective, 2 Emerging Themes Epidemiology 11, 15 (2005). 
  36. Rothman & Greenland, supra note 32, at S149.
  37. Kristen M. Fedak et al., Applying the Bradford Hill Criteria in the 21st Century: How Data Integration Has Changed Causal Inference in Molecular Epidemiology, 12 Emerging Themes Epidemiology 14, 17 (2015).
  38. Howick et al., supra note 8, at 189.
  39. Robert R. Maronpot et al., Relevance of Animal Carcinogenesis Findings to Human Cancer Predictions and Prevention, 32 Toxicologic Pathology 40, 41 (2004) (quoting David P. Rall et al., Alternatives to Using Human Experience in Assessing Health Risks, 8 Ann. Rev. Pub. Health 355, 356 (1987)).
  40. See Bernard D. Goldstein & Mary Sue Henifin, Reference Guide on Toxicology, in Reference Manual on Scientific Evidence 633, 636 (3d ed. 2011).
  41. Int’l Agency for Research on Cancer, IARC Monographs on the Evaluation of Carcinogenic Risks to Humans: Preamble 12–15 (2006).
  42. Fedak et al., supra note 37, at 16.
  43. See, e.g., Wells v. Ortho Pharm. Corp., 788 F.2d 741, 745 (11th Cir. 1986) (quoting Ferebee v. Chevron Chem. Co., 736 F.2d 1529, 1535 (D.C. Cir.), cert. denied, 469 U.S. 1062 (1984)); see also Kennedy, 161 F.3d at 1229–30.
  44. 161 F.3d at 1229–30.
  45. Neutra et al., supra note 7, at 157 (quoting Green et al., supra note 14, at 604).
  46. Glass et al., supra note 9, at 62.
  47. Goldstein & Henifin, supra note 40, at 655, 660 n.75.
  48. See, e.g., Harris v. CSX Transp., Inc., 753 S.E.2d 275, 306 (W. Va. 2013) (weight-of-the-evidence methodology is “recognized and highly respected in the scientific community”); see also Milward, 639 F.3d at 17–19 (endorsing the expert’s use of the weight-of-the-evidence analysis).