This leads us to the two core aspects of content validity: the extent to which the elements within a measurement procedure are relevant to, and representative of, the construct they will be used to measure. Each of these is discussed in turn:
Relevance means making sure that the elements within your measurement procedure reflect the construct you are interested in studying. We can think about relevance in terms of: (a) the purpose of the study; (b) the appropriateness of the elements included; and (c) the application of theory and judgement. Each of these is discussed in turn:
The purpose of the study
In well-constructed quantitative research, it is important to distinguish between ideas that we want to discuss at the conceptual level (e.g., the concept of poverty) and constructs that we want to measure (e.g., the construct of economic poverty).
For example, we may be able to talk about poverty at a broader conceptual level, but trying to measure a concept as broad as poverty would be far too broad and ambitious, if it were possible at all. Therefore, we choose to focus in on a specific aspect of the concept, poverty, which is the construct, economic poverty.
The content of a measurement procedure will only be relevant if it focuses in on the specific construct you want to measure, rather than any broader concept that may be of interest.
The appropriateness of the elements included
To ensure that the content of a measurement procedure is relevant, there also needs to be a close fit between the elements included in your measurement procedure and the specific construct that you are trying to study.
For example, imagine that we have operationally defined the construct, economic poverty, using an operational definition that looks at its intrinsic properties [NOTE: remember that there are multiple ways that a construct can be operationally defined]. By focusing on these intrinsic properties, we are interested in elements of economic poverty that include measures such as how much money a person has in the bank, how indebted they are, and so forth.
However, elements within measurement procedures do not only include measures (e.g., questions like: "How much money do you have in your bank account?"). Elements also refer to the way that such measures are coded.
In terms of economic poverty, take the measure: "How much money do you have in your bank account?" We could choose to code this as a continuous variable (i.e., the person enters the exact amount of money) or an ordinal variable (i.e., the person ticks a box that corresponds to one of a set of pre-determined bands; perhaps $1-100, $101-250, $251-1000, and so forth).
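To make this coding choice concrete, the short sketch below codes the same bank-balance responses both ways. It is only an illustration: the use of Python and pandas, the band boundaries and the variable names are assumptions of convenience rather than part of any particular measurement procedure.

```python
import pandas as pd

# Illustrative responses to: "How much money do you have in your bank account?"
responses = pd.DataFrame({"bank_balance": [37.50, 180.00, 640.00, 2500.00]})

# Option 1: code the measure as a continuous variable
# (the person enters, and we keep, the exact amount).
responses["balance_continuous"] = responses["bank_balance"]

# Option 2: collapse the measure into pre-determined bands
# (the person ticks the box that matches their balance).
bands = [0, 100, 250, 1000, float("inf")]
labels = ["$1-100", "$101-250", "$251-1000", "$1000+"]
responses["balance_banded"] = pd.cut(responses["bank_balance"], bins=bands, labels=labels)

print(responses)
```

Note that the two codings are not interchangeable: the banded version throws away information (you cannot recover the exact balance from the band), which is exactly why the choice needs to be justified against the operational definition you have chosen.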
When deciding which elements to include within a measurement procedure (i.e., the specific measures and coding), you need to think about how appropriate (i.e., relevant) these elements are to the operational definition you have chosen. This has to take place for a measurement procedure to be considered content valid.
The application of theory and judgement
The relevance of the elements that you choose to include within a measurement procedure also depends on theory and judgement.
Theory is important because we need to be able to justify why certain elements are included whilst others are not. Often, such justifications come from the academic literature, which helps to explain what elements should be considered for a given construct. In addition, previous studies may have also illustrated how certain elements would be best measured and coded (e.g., how a variable such as a person's bank balance would be best measured and coded). As a result, whilst you may have to come up with some elements from scratch when creating a measurement instrument, you may be able to take others directly from previous studies.
Judgement plays a role because there are no right or wrong answers when it comes to selecting which elements should be included within a measurement procedure. Academics will often disagree over the most appropriate elements. However, for a measurement procedure to be content valid, the elements to be included will ideally have been selected with the help of experts in the field.
This leads us onto the representativeness of elements within a measurement procedure.
Representativeness reflects the extent to which your measurement procedure over-represents, under-represents or excludes the elements required to measure the construct you are interested in. We can think about representativeness in terms of: (a) over- and under-representing elements; and (b) excluding elements. Each of these is discussed in turn:
Over- and under-representing elements
There is a particular danger of over-representing elements within a measurement procedure when the construct we are interested in (e.g., sleep quality) has multiple dimensions [NOTE: remember that constructs can have multiple dimensions].
For example, take the construct, sleep quality, whose content validity has been demonstrated through a questionnaire known as the Pittsburgh Sleep Quality Index (PSQI), a 19-item questionnaire that consists of 7 components: (1) subjective sleep quality, (2) sleep latency, (3) sleep duration, (4) habitual sleep efficiency, (5) sleep disturbances, (6) use of sleeping medication, and (7) daytime dysfunction (Buysse et al., 1989). Each of these 7 components aims to measure a different dimension of the construct, sleep quality. In assessing an individual's sleep quality, these 19 items lead to a global score (i.e., a single score) between 0 and 21, where higher scores mean an individual experiences poorer sleep quality (i.e., greater sleep disturbance). However, whilst there are 19 items (i.e., 19 measures of sleep quality), there are 7 dimensions of sleep quality being studied (e.g., subjective sleep quality, sleep latency, etc.; see above for the complete list). Therefore, some of these 7 dimensions will be addressed by more items than others.
The example above illustrates two problems:
In creating a measurement procedure to assess the construct you are interested in (e.g., sleep quality), there may be multiple dimensions you are trying to study. When this happens, some dimensions may require more measures (e.g., questions in a survey) than others (e.g., we may require 4 measures to assess the dimension, sleep duration, but only 1 measure to assess the dimension, use of sleeping medication). If we get the balance wrong (that is, if we use too many measures for one dimension and too few for another), we potentially over-represent or under-represent the elements required to capture the construct we are interested in when using our measurement procedure (i.e., the measurement procedure is not content valid).
We have to determine how to weight the scores that are attributed to a particular measure. For example, if our goal is to come up with a global score (i.e., a single score) for each participant from our measurement procedure (e.g., a global score of 0-21 in our sleep quality example), how do we achieve this? Do we give each measure an equal weighting? For example, do we take the 4 measures required to assess the dimension, sleep duration, and average them, or do we add them all together? After all, if we added these 4 scores together, the total score for the dimension, sleep duration, would be much higher than the score for the dimension, use of sleeping medication, which only gives us 1 score (assuming the scales are the same). If we get this weighting wrong, we may over-represent or under-represent the elements required to capture the construct we are interested in when using our measurement procedure (i.e., the measurement procedure is not content valid).
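To see how much difference this weighting decision can make, here is a minimal sketch that scores a hypothetical two-dimension instrument both ways. The item scores, the 0-3 scale and the dimension names are invented purely for illustration; this is not the actual scoring procedure of the PSQI or any other published instrument.

```python
# Hypothetical item scores, each scored 0-3, for one participant.
# The dimension "sleep_duration" has 4 items; "sleeping_medication" has only 1.
item_scores = {
    "sleep_duration": [2, 3, 1, 2],
    "sleeping_medication": [3],
}

# Option 1: sum the items within each dimension.
# The 4-item dimension can contribute up to 12 points while the 1-item
# dimension can contribute at most 3, so sleep duration dominates the global score.
summed = {dim: sum(scores) for dim, scores in item_scores.items()}
print(summed, "global score:", sum(summed.values()))
# {'sleep_duration': 8, 'sleeping_medication': 3} global score: 11

# Option 2: average the items within each dimension.
# Every dimension is now expressed on the same 0-3 scale, so each carries equal weight.
averaged = {dim: sum(scores) / len(scores) for dim, scores in item_scores.items()}
print(averaged, "global score:", sum(averaged.values()))
# {'sleep_duration': 2.0, 'sleeping_medication': 3.0} global score: 5.0
```

Neither option is automatically correct; the point is that the weighting must be chosen (and justified) so that no dimension is over- or under-represented relative to the construct being measured.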
Ultimately, the problem of over- and under-representation of elements is that the results that we get from our study may reflect those elements that are over- or under-represented, and not the construct that we are actually trying to measure.
Excluding elements
If we exclude elements from a measurement procedure when we shouldn't have done, we risk missing important content that reflects the construct we are interested in. Whilst this may result in an over- or under-representation of certain elements, we are more concerned about the results not giving us the whole picture of the construct we are interested in. This reduces the content validity of our measurement procedure.
As you may have gathered from this article, content validity and construct validity are strongly related. If you are still trying to get your head around how content validity may affect your dissertation, we would recommend that you learn more about concepts, constructs and variables. A good starting point is the section on Concepts, constructs and variables. However, if you are comfortable with the idea of content validity, you may want to start thinking about how you can quantitatively assess different aspects of the content validity of your measurement procedure, using statistical tests such as principal component analysis (PCA) and factor analysis [see the Data Analysis section of Lærd Dissertation to find out what these statistical tests are and how to run, interpret and write them up].
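If you want a feel for what such an analysis looks like in practice, the sketch below runs a principal component analysis on simulated questionnaire responses using scikit-learn. The simulated data, the number of items and the choice of library are assumptions made purely for illustration; consult the Data Analysis section for how these tests should actually be run, interpreted and written up.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Simulated responses: 200 participants answering 6 questionnaire items scored 0-3.
rng = np.random.default_rng(0)
responses = rng.integers(0, 4, size=(200, 6)).astype(float)

# Standardise the items so that each contributes equally to the analysis.
scaled = StandardScaler().fit_transform(responses)

# Fit the PCA; items that load strongly on the same component are candidates
# for measuring the same dimension of the construct.
pca = PCA()
pca.fit(scaled)

print(pca.explained_variance_ratio_)  # proportion of variance explained per component
print(pca.components_)                # item loadings on each component
```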
Buysse, D. J., Reynolds, C. F., III, Monk, T. H., Berman, S. R., & Kupfer, D. J. (1989). The Pittsburgh Sleep Quality Index: A new instrument for psychiatric practice and research. Psychiatry Research, 28(2), 193-213.
Haynes, S. N., Richard, D. C. S., & Kubany, E. S. (1995). Content validity in psychological assessment: A functional approach to concepts and methods. Psychological Assessment, 7(3), 238-247.