Content validity

Content validity is the extent to which the elements within a measurement procedure are relevant and representative of the construct that they will be used to measure (Haynes et al., 1995). Establishing content validity is a necessary initial task in the construction of a new measurement procedure (or the revision of an existing one). However, the validity (e.g., construct validity) and reliability (e.g., internal consistency) of the content (i.e., elements) selected should be tested before an assessment of content validity can be made. If you are unfamiliar with the idea of concepts and constructs in research, it is probably worth first reading the section on Concepts, constructs and variables. In this article, we explain what content validity is and provide some examples. We do this by discussing the relationship between constructs and content validity, as well as highlighting two important aspects of content validity: relevance and representativeness.

Constructs and content validity

The operational definition of some constructs can be very straightforward, making it relatively easy to be confident that a measurement procedure (e.g., a survey, a structured observation, a structured interview) is content valid. For example, we can measure the construct, height, in centimetres, or the construct, weight, in kilograms. These operational definitions are quite obvious, and it is easy to settle on a single one. However, it is often far more challenging to create reliable operational definitions for more complex constructs like anger, depression, motivation, and task performance [see the section on Concepts, constructs and variables]. The relative complexity of these types of construct reflects a number of factors: (a) the number of dimensions and measures a construct has; (b) the number of ways constructs can be operationally defined; and (c) the potential for a construct to be confounded. Each is discussed in turn:

The number of dimensions and measures a construct has

Simple constructs such as weight and height are fairly one-dimensional, but other more complex constructs are multi-dimensional. By multi-dimensional, we mean that these more complex constructs (e.g., anger, depression, motivation, sleep quality, etc.) consist of a number of components, each of which describes a different aspect of the construct.

For example, take the construct, sleep quality, whose content validity has been demonstrated through a questionnaire known as the Pittsburgh Sleep Quality Index (PSQI), a 19-item questionnaire that consists of 7 components: (1) subjective sleep quality, (2) sleep latency, (3) sleep duration, (4) habitual sleep efficiency, (5) sleep disturbances, (6) use of sleeping medication, and (7) daytime dysfunction (Buysse et al., 1989). Each of these 7 components aims to measure a different dimension of the construct, sleep quality.
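To make the idea of a multi-dimensional construct concrete, the following is a minimal sketch of how the seven component scores might be combined into a single global score. It assumes that each component is scored from 0 to 3 and that the global score is the simple sum of the seven components (giving a range of 0 to 21), which is the scoring described for the published instrument but is not stated in this article. The function and variable names are hypothetical, chosen only for illustration.

```python
# The seven PSQI components listed above (Buysse et al., 1989).
PSQI_COMPONENTS = (
    "subjective_sleep_quality",
    "sleep_latency",
    "sleep_duration",
    "habitual_sleep_efficiency",
    "sleep_disturbances",
    "use_of_sleeping_medication",
    "daytime_dysfunction",
)


def global_sleep_score(component_scores):
    """Sum the seven component scores into one global score.

    Assumption: each component is an integer from 0 to 3, so the
    global score falls between 0 and 21.
    """
    missing = [name for name in PSQI_COMPONENTS if name not in component_scores]
    if missing:
        raise ValueError(f"missing component scores: {missing}")

    total = 0
    for name in PSQI_COMPONENTS:
        score = component_scores[name]
        if score not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be 0, 1, 2 or 3, got {score!r}")
        total += score
    return total
```

Requiring all seven components before a score is returned reflects the point made above: if one dimension is left out, the result no longer represents the whole construct.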

This leads on to the idea of the number of measures a construct has. Simple constructs like weight and height may have just one measure (e.g., kilograms, centimetres, etc.). However, for more complex constructs, multiple measures may be required, each with different elements. Note that elements are all of those aspects of the measurement procedure that affect the data being collected. In terms of measures, these elements include things like questionnaire items (e.g., the number of questions used for each dimension of a construct) and coding criteria (i.e., what types of measures are used, including factors like the types of variables - nominal and continuous variables - and the scales used - continuous scales, Likert scales, and so forth). The more dimensions and measures a construct has, the more difficult it is likely to be to ensure that the measurement procedure you are trying to create is content valid.
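One way to picture these elements is as a list of questionnaire items, each tagged with the dimension it is meant to measure and the scale it uses. The sketch below is purely illustrative: the Item structure, the function names and the idea of counting items per dimension are assumptions made for this example, not an established procedure for assessing content validity. It simply shows how you might spot a dimension that has no items at all.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A hypothetical questionnaire item (one element of a measure)."""
    text: str
    dimension: str  # the dimension of the construct the item targets
    scale: str      # e.g. "likert" or "continuous"


def items_per_dimension(items, dimensions):
    """Count how many items target each dimension of the construct."""
    counts = Counter(item.dimension for item in items)
    return {dimension: counts.get(dimension, 0) for dimension in dimensions}


def uncovered_dimensions(items, dimensions):
    """Return the dimensions that no item targets."""
    counts = items_per_dimension(items, dimensions)
    return [dimension for dimension, n in counts.items() if n == 0]
```

For instance, if a draft questionnaire on sleep quality had items for sleep latency and sleep duration but none for daytime dysfunction, uncovered_dimensions would return that dimension. A count like this says nothing about whether the items are relevant, so it could only ever be a first check on representativeness.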

The number of ways constructs can be operationally defined

The operational definitions of constructs are based on the concepts you are trying to study [see the section on Concepts, constructs and variables]. However, concepts can be studied using a wide range of constructs, and these constructs, in turn, can be explained using a number of operational definitions.

For example, the concept of poverty could be viewed from a range of perspectives (e.g., poverty gap, economic poverty, poverty and welfare, etc.). When we focus on one of these perspectives of poverty, we may choose to create a measurement procedure to examine the construct, economic poverty. However, there are a number of ways that the construct, economic poverty, can be operationally defined. For instance, we could use an operational definition that examines how a person behaves through some characteristic of economic poverty (e.g., deprive the person of a wage for a month and measure how they respond/cope), or look at the intrinsic properties of economic poverty (e.g., how much money a person has in the bank, how indebted they are, etc.). These different operational definitions will affect (a) the way that the construct, economic poverty, is measured and (b) the way that we interpret the results about economic poverty.
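The effect of choosing one operational definition over another can be sketched in a few lines of code. The example below takes the two intrinsic properties mentioned above (money in the bank and level of debt) and turns each into a separate definition of economic poverty. The thresholds, names and figures are all hypothetical, invented only to show that the same person can be classified differently depending on the definition used.

```python
from dataclasses import dataclass

# Hypothetical cut-offs, chosen only for illustration.
SAVINGS_THRESHOLD = 500.0   # less than this in the bank counts as poor
DEBT_THRESHOLD = 5000.0     # more than this owed counts as poor


@dataclass(frozen=True)
class Person:
    savings: float  # how much money the person has in the bank
    debt: float     # how indebted the person is


def is_poor_by_savings(person):
    """Operational definition 1: economic poverty as low savings."""
    return person.savings < SAVINGS_THRESHOLD


def is_poor_by_debt(person):
    """Operational definition 2: economic poverty as high debt."""
    return person.debt > DEBT_THRESHOLD


# The same (made-up) person under both definitions.
person = Person(savings=2000.0, debt=8000.0)
classified_by_savings = is_poor_by_savings(person)  # False
classified_by_debt = is_poor_by_debt(person)        # True
```

Here the person is not in economic poverty under the savings definition but is under the debt definition, so any conclusion drawn about economic poverty depends on which definition the measurement procedure adopted.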

From the above example, we can see that the ways that constructs are operationally defined may reflect (a) the context in which a construct is being applied (e.g., types of poverty and their relative operational definitions), but also (b) a general lack of agreement between academics concerning the content of a particular construct (i.e., what elements should and should not be included in a given construct). These factors make it more difficult to ensure that the measurement procedure you are trying to create is content valid.

The potential for a construct to be confounded

There is a lot of ambiguity not only in the way that constructs can be operationally defined, but also in how different constructs relate to one another (e.g., how the construct, anger, relates to the construct, depression). What are the boundaries between these different constructs? Where does one construct start and the other end?

When different constructs overlap, the results that we generate when measuring these constructs can become confounded [see the article: Extraneous and confounding variables]. The same can be said about the content validity of a measurement procedure. How do we know that the elements we include in a measurement procedure are relevant and representative of the construct we are trying to measure?