Threats to external validity

Threats to external validity are any factors within a study that reduce the generalisability (or generality) of the results. Dissertations can suffer from a wide range of potential threats to external validity, which have been discussed extensively in the literature (e.g., Campbell, 1963, 1969; Campbell & Stanley, 1963, 1966; Cook & Campbell, 1979). In this section, we discuss four of the main threats to external validity that you may face in your research: (a) selection biases; (b) constructs, methods and confounding; (c) the 'real world' versus the 'experimental world'; and (d) history effects and maturation. Each of these threats is explained in turn, with accompanying examples.

Selection biases and external validity

Since one of the main goals of dissertations that adopt quantitative research designs is to make generalisations from the sample being studied to (a) the population the sample is drawn from, and (b) in some cases, across populations, selection biases are arguably one of the most significant threats to external validity. Samples consist of units, which can be people, cases (e.g., organisations, institutions), pieces of data, and so forth, but we focus on people in our explanations.

In this section, we (a) explain what selection bias is and the implications that it has for external validity, (b) present the problems that arise from using voluntary participants, which are often required for reasons of research ethics, and (c) highlight the implications of using student samples, common in undergraduate and master's level dissertations. Each of these is discussed in turn:

What is selection bias?

As the saying goes, 'No two people are the same'. They differ along a wide range of factors, such as age, gender, height, intelligence, attitude, behaviour, and so forth. In experimental and quasi-experimental research, you need to make sure that the groups are equivalent before you start (i.e., before any interventions are made). Otherwise, pre-existing differences between the treatment and control groups may explain the differences in scores on the dependent variable. In other words, you need to take into account such individual differences when selecting participants for your research.

When the sample that is studied does not represent the population that the researcher hopes to make generalisations to, there has been a selection bias. Where selection bias occurs, it is difficult (or maybe impossible, depending on the level of selection bias) to argue that the results that come from a biased sample can be generalised to the wider population.

Selection bias can be reduced in experimental research designs because one of the fundamental criteria is the random assignment of participants to the different groups that you are comparing. By random assignment, we mean that each participant has an equal chance of being placed in any of the groups being compared, so that the groups can be expected to be similar across a range of general and specific characteristics. The more general characteristics include factors such as age and gender. However, there may also be specific characteristics that you want to take account of, which will depend on the nature of the research you are performing. By comparison, quasi-experimental research designs do not involve the random assignment of participants to the different groups being compared. As the article Quasi-experimental research design shows, such a design may have been chosen intentionally, or it may not have been possible to randomly assign participants. This may reflect the difficulty in meeting the requirements of a probability sample, such as obtaining a detailed list of the population being studied, which forces you to select a non-probability sample [see the section on Sampling Strategy]; or you may be studying pre-existing groups where it is impossible to reassign participants (e.g., a class of students from one school and a class of students from another school). Therefore, selection bias is likely to be a more significant threat to external validity when you are using a quasi-experimental research design.
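To make the idea of random assignment concrete, the following is a minimal sketch in Python. It is an illustration only, not a prescribed procedure: the function name randomly_assign, the group labels and the participant identifiers are all hypothetical. The participants are shuffled and then dealt into the groups in turn, so that each person has the same chance of ending up in either group and the groups are of equal (or near-equal) size.

```python
import random


def randomly_assign(participants, groups=("treatment", "control"), seed=None):
    """Shuffle participants, then deal them into the groups in turn.

    Hypothetical helper for illustration; `seed` only makes the
    allocation reproducible so that it can be documented.
    """
    rng = random.Random(seed)
    shuffled = list(participants)  # copy, so the original list is untouched
    rng.shuffle(shuffled)

    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        # Round-robin dealing keeps group sizes within one of each other.
        assignment[groups[index % len(groups)]].append(person)
    return assignment


# Hypothetical participant identifiers P01 to P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
assigned = randomly_assign(participants, seed=42)

for group, members in assigned.items():
    print(group, len(members), members)
```

Note that random assignment of this kind only makes the groups comparable on average. With small samples, it is still worth checking characteristics such as age and gender in each group after assignment.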

Ultimately, samples are not perfect representations of populations, even when considerable expense and care are taken (i.e., even when using probability sampling techniques and random assignment). As a result, when other researchers try to replicate a study, it is possible that the samples are not similar (e.g., more men than women), such that different results are attained. In such cases, it is important to assess whether the causal relationships or differences found were the result of the treatment or of differences in the samples (e.g., gender make-up). However, this is not so much about poor sampling (or more appropriately, unrepresentative sampling), but the fact that extraneous variables, which relate to the characteristics of the sample, have become confounding variables, limiting the generalisability of the results [see the article: Extraneous and confounding variables]. Furthermore, a study is only likely to look at certain characteristics of a population; that is, it will not necessarily look for every difference in the relationships studied (usually between two variables) across sample characteristics (e.g., age, gender, attitudes, personality, etc.). Yet it may be differences in these sample characteristics that limit the generalisability of results to a wider population.

The problem of volunteer bias

When participants engage in research, it is expected that they do so voluntarily. This is an important component of research ethics [see the article: Principles of research ethics]. However, research has shown that volunteers do not have the same characteristics as the general population (e.g., Rosenthal and Rosnow, 1975). People may volunteer to take part in research for specific purposes (e.g., personal reasons), which influence how they respond during the research process, whether the measurement procedure is an interview, focus group, survey, or something else.

How the characteristics of volunteers differ from the general population is likely to depend on the phenomenon you are investigating. For example, men may be more likely to volunteer for research into exercise and weight training, whilst women may be more likely to volunteer for research into retail habits. Whilst these are crude stereotypes, it is important to recognise such differences between volunteers, as well as the difficulty in identifying potential differences. Think about what research you may be willing (and unwilling) to volunteer for, and whether other people you know are similar (or dissimilar) to you.

Whilst it is not expected that the sample you study will be perfectly representative of the population you are interested in, the use of volunteers adds an additional layer of potential bias. This is known as volunteer bias. Since volunteer bias reduces the homogeneity (i.e., similarity) between the characteristics of your sample and those of the population you are interested in, it threatens (i.e., reduces) the external validity of your findings; that is, it threatens your ability to make generalisations from your sample to the population you are interested in.

In practice, it is extremely difficult to avoid volunteer bias. However, asking participants why they volunteered may highlight the extent to which volunteer bias could have reduced the external validity of your findings.

The use of student samples

As an undergraduate or master's level dissertation student, it is common to use other university students as the main participants in your research. Whilst this provides for a much more accessible sample, it will inevitably result in selection bias, reducing the ability to make generalisations to a wider population, which is unlikely to be so heavily made up of university students.

Further considerations

Clearly, selection bias, including volunteer bias and the use of student samples, can reduce the extent to which samples are representative of the populations they are drawn from. This reduces the ability to make generalisations from your sample to the wider population. However, the extent to which your findings can be generalised across populations will also depend on the breadth of the characteristics that are included in your sample. For example, when sampling, you may stratify your sample to ensure that there is a representative proportion of males and females (i.e., gender), people of different ages, and so forth. However, you may not have distinguished other characteristics of the population you were studying (e.g., educational level, occupation, etc.). This will limit the extent to which you can generalise your results across populations.
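The stratification mentioned above can be illustrated with a short Python sketch. This is a simplified example under stated assumptions rather than a definitive procedure: the function stratified_sample, the population of 100 records and its 60/40 gender split are all invented for illustration. The population is divided into strata on one characteristic, and each stratum contributes a share of the sample in proportion to its share of the population.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Draw a proportionate stratified random sample.

    Hypothetical helper for illustration. `stratum_of` maps a unit to
    its stratum (e.g., gender). Because each quota is rounded, the total
    can differ slightly from `sample_size` when proportions are uneven.
    """
    rng = random.Random(seed)

    # Group the population into strata.
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)

    # Sample from each stratum in proportion to its size.
    sample = []
    for members in strata.values():
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample


# Invented population: 60 females and 40 males.
population = [
    {"id": i, "gender": "female" if i < 60 else "male"} for i in range(100)
]
sample = stratified_sample(population, lambda unit: unit["gender"], 20, seed=1)

females = sum(1 for unit in sample if unit["gender"] == "female")
print(len(sample), females, len(sample) - females)
```

The sketch stratifies on gender only. Any characteristic that is not used to define the strata (e.g., educational level or occupation) is left to chance, which is exactly the limitation on generalising across populations described above.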
