Walter A. Kukull, PhD and Mary Ganguli, MD, MPH
From the Department of Public Health (W.A.K.), University of Washington School of Public Health, Seattle; and Departments of Psychiatry, Neurology, and Epidemiology (M.G.), University of Pittsburgh School of Medicine and Graduate School of Public Health, Pittsburgh, PA.
Clinical and epidemiologic investigations are paying increasing attention to the critical constructs of “representativeness” of study samples and “generalizability” of study results. This is a laudable trend, and yet these key concepts are often misconstrued and conflated, masking the central issues of internal and external validity. The authors define these issues and demonstrate how they are related to one another and to generalizability. Providing examples, they identify threats to validity from different forms of bias and confounding. They also lay out relevant practical issues in study design, from sample selection to assessment of exposures, in both clinic-based and population-based settings.
Only to the extent that we are able to explain empirical facts can we attain the major objective of scientific research, namely not merely to record the phenomena of our experience, but to learn from them, by basing upon them theoretical generalizations which enable us to anticipate new occurrences and to control, at least to some extent, the changes in our environment.1(p12)
“This study sample is not representative of the population!” “Our results are not generalizable …” Such comments are increasingly familiar, but what exactly do they mean? How do study design, subject ascertainment, and “representativeness” of a sample affect “generalizability” of results? Do study results generalize only from statistically drawn samples of a common underlying population? Has “lack of generalizability” become the low-hanging fruit, ripe for plucking by the casual critic?
INTERNAL AND EXTERNAL VALIDITY
Confusion around generalizability has arisen from the conflation of 2 fundamental questions. First, are the results of the study true, or are they an artifact of the way the study was designed or conducted; i.e., is the study internally valid? Second, are the study results likely to apply, generally or specifically, in other study settings or samples; i.e., are the study results externally valid?
Thoughtful study design, careful data collection, and appropriate statistical analysis are at the core of any study's internal validity. Whether or not those internally valid results will then broadly “generalize” to other study settings, samples, or populations is as much a matter of judgment as of statistical inference. The generalizability of a study's results depends on the researcher's ability to separate the “relevant” from the “irrelevant” facts of the study, and then carry forward a judgment about the relevant facts,2 which would be easy if we always knew what might eventually turn out to be relevant. After all, we generalize results from animal studies to humans, if the common biologic process or disease mechanism is “relevant” and species is relatively “irrelevant.” We also draw broad inferences from randomized controlled trials, even though these studies often have specific inclusion and exclusion criteria, rather than being population probability samples. In other words, generalization is the “big picture” interpretation of a study's results once they are determined to be internally valid.
SAMPLING AND REPRESENTATIVENESS
The statistical concepts of sampling theory and hypothesis testing have become intermingled with the notion of generalizability. Strict estimation of quantities based on a probability sample of a “population,” vs assessing all members of that population, remained an object of considerable argument among statisticians until the early 20th century.3 Sampling was adopted of necessity because studying the entire population was not feasible. Fair samples must provide valid estimates of the population characteristics being studied. This quite reasonable concept evolved in common usage so that “population” became synonymous with “all persons or all cases.” It followed that to obtain representative and generalizable sample estimates, a probability sample of “all” must be drawn. Logically, then, “all” must somehow be enumerated before representative samples can be drawn. The bite of the vicious circle becomes obvious when “all” literally means all in a country or continent. Yet enumeration may be achievable when care is taken to establish more finite population boundaries.
Statisticians Kruskal and Mosteller3–6 conducted a detailed examination of nonscientific, “extrastatistical scientific,” and statistical literature to classify uses of the term representative sample or sampling. Those meanings are 1) “general, unjustified acclaim for the data”; 2) “absence (or presence) of selective forces”; 3) “mirror or miniature of the population”; 4) “typical or ideal case … that represents it (the population) on average”; 5) “coverage of the population … (sample) containing at least one item from each stratum …”; 6) “a vague term to be made precise” by specification of a particular statistical sampling scheme, e.g., simple random sampling. In statistical literature, representative sampling meanings include a) “a specific sampling method”; b) “permitting good estimation”; and c) “good enough for a particular purpose.”4 The conflicts and ambiguities among the above uses are obvious, but how do we seek clarity in our research discourse?
POPULATIONS, CLINICS, AND BOUNDARIES
So is there in fact any value to population-based studies (indeed there is!), and if so, how should we define a “population”? We first define it by establishing its boundaries (e.g., counties, insurance memberships, schools, voter registration lists). The population is made up entirely of members with disease (cases) and members without disease (noncases), leaving no one out. Ideally, we would capture and study all cases, as they occur. As a comparison group, we would also include either all noncases, or a probability sample of noncases.7 The choice of “boundaries” for a study population influences internal and external validity. If we deliberately or inadvertently “gerrymander” our boundaries, so that the factor of interest is more (or less) common among cases than among noncases, the study base will be biased and our results will be spurious or misleading.
Adequately designed population-based studies minimize the possibility that selection factors will have unintended adverse consequences on the study results. Further, since any effect we measure depends as much on the comparison group as it does on the case group, appropriate selection is no less important for the noncases than it is for cases. This is true whether the study is clinic-based or population-based. Population-based research anchors the comparison group to the cases.
Clinic-based investigations are exemplified by those conducted at Alzheimer's Disease Research Centers (ADRCs). They typically examine high-risk, family-based, clinic-based, or hospital-based groups, to observe associations with treatment or disease. This is an efficient way to facilitate detailed study of “clean” diagnostic subgroups. The external validity of these studies rests on the judgment of whether the subject selection process itself could have spuriously influenced the results. This determination is often harder in clinic-based studies than in population-based studies. Replication in an independent sample is therefore key, but replication is more elusive and complicated with clinic-based studies, as we discuss later.
Regardless of whether the study sample is clinic-based or population-based, how well and completely we identify “disease” (including preclinical or asymptomatic disease), not only in our case group, but also among those in our comparison group, can adversely affect results. For example, consider a study of Alzheimer disease (AD) in which, unbeknownst to the subjects as well as the investigators, the cognitively normal control group includes a large proportion of persons with underlying AD pathology. The resulting diagnostic misclassification, caused by including true “cases” among the noncases, would spuriously distort and weaken the observed results. This distortion can occur in clinic-based or population-based studies; it is a matter of internal validity tied to diagnostic accuracy, rather than an issue of representativeness or generalizability.
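The weakening effect of undetected cases in the control group can be illustrated with simple arithmetic. The sketch below is not drawn from the article's data: the exposure proportions (0.50 among true cases, 0.25 among true noncases) and the 30% contamination figure are hypothetical values chosen only for illustration, and the function names are our own. It assumes that hidden cases among the controls carry the exposure as often as recognized cases do.

```python
import math


def odds(p):
    """Convert a proportion into odds."""
    return p / (1.0 - p)


def observed_odds_ratio(p_exposed_cases, p_exposed_noncases, hidden_case_fraction):
    """Odds ratio seen when a fraction of the 'controls' are unrecognized cases.

    Assumption (hypothetical): hidden cases are exposed as often as
    recognized cases, so the control group is a mixture of the two.
    """
    p_controls = ((1.0 - hidden_case_fraction) * p_exposed_noncases
                  + hidden_case_fraction * p_exposed_cases)
    return odds(p_exposed_cases) / odds(p_controls)


# Hypothetical inputs: 50% of true cases and 25% of true noncases are exposed.
true_or = observed_odds_ratio(0.50, 0.25, hidden_case_fraction=0.0)
contaminated_or = observed_odds_ratio(0.50, 0.25, hidden_case_fraction=0.30)

print(f"odds ratio with clean controls:        {true_or:.2f}")
print(f"odds ratio with 30% hidden cases:      {contaminated_or:.2f}")
```

Under these assumed numbers the observed odds ratio moves toward 1 as the share of hidden cases grows, which is the attenuation described above; no sampling scheme, however representative, would correct it.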
Bias causes observed measurements or results to differ from their true values because of systematic, but unintended, “errors,” for example, in the way we ascertain and enroll study subjects (selection bias), or the way we collect data from them (information bias). Statistical significance of study results, regardless of p value, is completely irrelevant as a means of evaluating results when bias is active.
Selection bias is often subtle, and requires careful thought to discern the potential effect on the hypotheses being tested. For example, would selection bias render clinic-based ADRC study results suspect, if not invalid? Unfortunately, the answer is not simple; it depends on what is being studied and whether “selection” into the ADRC study distorts the true association. There are numerous advantages to recruiting research participants from specialized memory disorder clinics, as in the typical ADRC. Both AD cases and healthy controls are selected (as volunteers or referrals) under very specific circumstances that ensure their contribution to AD research. They either have (cases) or do not have (controls) the clinical/pathologic features typical of AD. Cases meet the research diagnostic criteria for AD; they have “reliable informants” who will accompany them to clinic visits; neither cases nor controls can have various exclusionary features (e.g., comorbid stroke or major psychiatric disorder); all are motivated to come to the clinic and participate fully in the research, including neuroimaging and lumbar puncture; many are eager to enter clinical trials, and many consent to eventual autopsy. AD cases who fit the above profile are admirable for their enthusiasm and altruism, but may not be typical, nor a probability sample of all AD cases in the population base from whence they came. The differential distribution of study factors between AD cases who did and did not enroll could give us an indication of whether bias may be attenuating or exaggerating the specific study results, if we were able to obtain that information. Therefore, the astute reader asks: “Can the underlying population base, from which the subjects came, be described? Could the population base's established boundaries or inclusion characteristics have affected the results? Was subject enrollment in any way influenced by the factors being studied?” In a clinic-based study it is rarely easy to describe the unenrolled cases (or unenrolled noncases) from the underlying population base in order to make such comparisons. It helps internal validity very little to claim that the enrollees' age, race, and sex distributions are in similar proportions to the population of the surrounding county, if age, race, and sex have little to do with the factor being studied, and if participation is differentially associated with the factors being studied.
Note that population-based studies are not inherently protected from bias; individuals sampled from the community, who are not seeking services, may consent or refuse to participate in research, and their willingness to participate is unlikely to be random. If we were concerned about selection bias in a study assessing pesticide exposure as a risk factor for Parkinson disease (PD), we could ask, “Were PD cases who had not been exposed to pesticides more (or less) likely to refuse enrollment in our study than PD cases who had been exposed?”
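The pesticide question can be made concrete with a 2 × 2 table. The following is a minimal sketch under stated assumptions, not an analysis of any real study: the population counts and participation probabilities are hypothetical, chosen so that there is no true association (odds ratio of 1), and the function and variable names are our own.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases, exposed_noncases, unexposed_noncases):
    """Cross-product odds ratio for a 2 x 2 table."""
    return (exposed_cases * unexposed_noncases) / (unexposed_cases * exposed_noncases)


def enrolled_counts(population, participation):
    """Expected number enrolled in each cell, given per-cell participation."""
    return {cell: population[cell] * participation[cell] for cell in population}


# Hypothetical source population with NO true association (odds ratio = 1).
population = {
    "exposed_cases": 100, "unexposed_cases": 400,
    "exposed_noncases": 1000, "unexposed_noncases": 4000,
}

# Participation differs by case status only: cases enroll more readily.
by_status_only = {
    "exposed_cases": 0.9, "unexposed_cases": 0.9,
    "exposed_noncases": 0.5, "unexposed_noncases": 0.5,
}

# Participation among cases also depends on exposure (the worrying scenario).
differential = {
    "exposed_cases": 0.9, "unexposed_cases": 0.6,
    "exposed_noncases": 0.5, "unexposed_noncases": 0.5,
}

or_population = odds_ratio(**population)
or_status_only = odds_ratio(**enrolled_counts(population, by_status_only))
or_differential = odds_ratio(**enrolled_counts(population, differential))

print(f"source population:               {or_population:.2f}")
print(f"participation by case status:    {or_status_only:.2f}")
print(f"participation by status and exposure: {or_differential:.2f}")
```

In this sketch, unequal participation between cases and noncases leaves the odds ratio untouched, whereas participation that depends jointly on disease and exposure manufactures an association where none exists. The sample in the second scenario is plainly not “representative,” yet its estimate remains valid.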
Selection bias may be not only inadvertent but also unavoidable. Some years ago, a startling finding8 was reported that AD cases who volunteered or were referred to an ADRC were significantly more likely to carry the APOE*4 genotype than were newly recognized AD cases captured through surveillance of a health maintenance organization population base within the same metropolitan area. The ADRC sample had yielded a biased estimate of APOE*4 allele frequency, and of its estimated relative risk, because ADRC cases were inadvertently selected on the basis of age, and it went unnoticed that the likelihood of carrying an APOE*4 allele decreases with age. There is no way the ADRC investigators could have detected this inadvertent selection bias had they not also had access to a population sample from the same base. A later meta-analysis of APOE*4 allele effects quantified the relationship between age and risk of AD associated with APOE alleles, and showed that AD risk due to APOE*4 genotype is lower in population samples than in specialty clinic samples.9 APOE allele frequency also could be influenced by study recruitment. Family history of AD seems to encourage participation in both clinical and population-based studies involving memory loss, and is also associated with APOE*4 frequency, thereby potentially biasing the magnitude of the APOE effect.
Survival bias is a form of selection bias that is beyond the control of the selector. For example, some African populations have high APOE*4 frequency but have not shown an elevated association between APOE*4 and AD.10,11 While there may be multiple reasons for this paradox, one possibility is that individuals with the APOE*4 genotype had died of heart disease before growing old enough to develop dementia.
Prevalence bias (length bias) is similar to survival bias. In the 1990s, numerous case-control studies showed a protective effect of smoking on AD occurrence.12 Assume that both AD and smoking shorten life expectancy and that AD cases enrolled in those studies some time after symptom onset. If age alone was the basis for potential selection bias, smoking should cause premature mortality equally among those who are and those who are not destined to develop AD. However, there is another aspect of selection bias called prevalence or length bias: at any given time, prevalent, i.e., existing, cases are those whose survival with disease (disease duration) was of greater length. If smokers with AD die sooner after AD onset than nonsmokers with AD, those prevalent AD cases available for study would “selectively” be nonsmokers. A scenario known as “competing risks” occurs when smoking affects the risk both of death and of AD.13 This would enhance the observed excess of smoking among “controls” and thereby inflate the apparent protective association between smoking and AD. Subsequently, longitudinal studies of smokers and nonsmokers showed an increased risk of AD incidence associated with smoking,12 suggesting that selection bias might have explained the earlier cross-sectional study results.
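The reversal described above can be reproduced with a short calculation. This is a simplified sketch, not a reanalysis of the cited studies: it assumes a steady state in which the pool of prevalent cases is proportional to incidence times mean disease duration, it ignores excess mortality among smokers in the control group, and every number in it (30% smokers, a rate ratio of 1.5, survival of 4 vs 8 years after onset) is hypothetical.

```python
import math


def smoker_odds_among_cases(smoker_fraction, rate_ratio,
                            duration_smokers=1.0, duration_nonsmokers=1.0):
    """Odds that a sampled case is a smoker.

    Each group's share of cases is weighted by incidence and, for prevalent
    cases, by mean survival after onset (steady-state assumption).
    """
    smokers = smoker_fraction * rate_ratio * duration_smokers
    nonsmokers = (1.0 - smoker_fraction) * duration_nonsmokers
    return smokers / nonsmokers


# Hypothetical inputs: smoking RAISES incidence but shortens survival with disease.
smoker_fraction = 0.30        # smokers in the source population
rate_ratio = 1.5              # incidence in smokers relative to nonsmokers
control_odds = smoker_fraction / (1.0 - smoker_fraction)

# Incident (newly diagnosed) cases: duration plays no role.
incident_or = smoker_odds_among_cases(smoker_fraction, rate_ratio) / control_odds

# Prevalent (existing) cases: smokers survive 4 years after onset, nonsmokers 8.
prevalent_or = smoker_odds_among_cases(
    smoker_fraction, rate_ratio, duration_smokers=4.0, duration_nonsmokers=8.0
) / control_odds

print(f"odds ratio using incident cases:  {incident_or:.2f}")
print(f"odds ratio using prevalent cases: {prevalent_or:.2f}")
```

With these assumed values, a true 50% increase in incidence appears as an apparent protective effect once cases are sampled from the prevalent pool, because the shorter-lived smokers are underrepresented among existing cases.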