Construct validity can be viewed as an overarching term to assess the validity of the measurement procedure (e.g., a questionnaire) that you use to measure a given construct (e.g., depression, commitment, trust, etc.). If you are unsure what we mean by terms such as constructs, variables, and conceptual and operational definitions, we would recommend that you first read the articles in the section on Constructs in quantitative research.
Construct validity is considered an overarching term to assess the measurement procedure used to measure a given construct because it incorporates a number of other forms of validity (i.e., content validity, convergent and divergent validity, and criterion validity) that help in the assessment of such construct validity (Messick, 1980). It is for this reason that construct validity is viewed as a process that you go through to assess the validity of a measurement procedure, whilst a number of other forms of validity are procedures (or tools) that you use to more practically assess whether the measurement procedure measures a given construct (Wainer & Braun, 1988). We explain this distinction, as well as the relationship between construct validity and other forms of validity, in the first section of this article: What is construct validity?
Overall, you should be aware that construct validity is something that develops gradually over time, even when a measurement procedure has already been shown to have strong construct validity. You cannot say that a measurement procedure has permanently or absolutely established construct validity. Rather, this is an ideal. With each additional study that shows a measurement procedure to have strong construct validity, especially in a wide range of contexts/situations, the claim of strong construct validity becomes stronger. In this article, we (a) explain what construct validity is, (b) discuss the various threats to construct validity that you may face, and (c) show you where you can find out more.
As briefly discussed above, construct validity can be viewed as an overarching term to assess the validity of the measurement procedure (e.g., a questionnaire) that you use to measure a given construct (e.g., depression, commitment, trust, etc.). This is because it incorporates a number of other forms of validity (i.e., content validity, convergent and divergent validity, and criterion validity) that help in the assessment of such construct validity (Messick, 1980). In this sense, construct validity is a process that you work through, involving a number of procedures (i.e., tests of validity, such as content validity, convergent validity, etc.) to assess the validity of the measurement procedure that you use in your dissertation to measure a given construct.
For example, let's imagine that we were interested in studying the construct, post-natal depression. In order to do this, new mothers taking part in the research were asked (a) to complete a 10-question survey (i.e., as a form of self-assessment) to assess various characteristics of post-natal depression, and (b) to be observed (i.e., participant observation) by trained psychiatric nurses, who used a scale to measure these different characteristics of post-natal depression. When assessing the construct validity of these two measurement procedures to measure the construct, post-natal depression, we would want to know:
Are the elements/questions used in the 10-question survey and the participant observation scale relevant and representative of the construct, post-natal depression, which they were supposed to be measuring? In terms of relevance, are the elements/questions appropriate considering the purpose of the study and the theory from which they are drawn? Furthermore, does the measurement procedure include all the necessary elements/questions? Is there an appropriate balance of elements, or are some over- or under-represented? This reflects the desire to assess the content validity of the measurement procedure [see the article: Content validity].
Do the 10 questions and participant observation scale only measure the construct we are interested in (i.e., post-natal depression), and not one or more additional constructs, such as post-partum mood, stress or anxiety? After all, when assessing the construct validity of a measurement procedure, we should not only check that the contents (i.e., elements) are relevant and representative of the construct we are interested in, but also that the measurement procedure is not measuring something that it should not be measuring. When this happens, the results can be confounded, which threatens the internal validity and external validity of your study [see the articles: Internal validity and External validity]. This reflects the desire to assess the divergent validity of the measurement procedure [see the article: Convergent and divergent validity].
Since the study used two different measurement procedures, how confident can we be that both measurement procedures were measuring the same construct (i.e., post-natal depression)? If both measurement procedures were new (i.e., you created them for your dissertation), we would want to assess their convergent validity, but if one was new (e.g., the 10-question survey), but the other was well-established (e.g., the participant observation scale), we would assess their concurrent validity [see the articles: Convergent and divergent validity and Criterion validity: (concurrent and predictive validity)].
Do the scores from the two measurement procedures used make accurate predictions (i.e., both theoretically and logically) about the construct they represent (i.e., post-natal depression)? This reflects the desire to assess the predictive validity of the measurement procedure [see the article: Criterion validity: (concurrent and predictive validity)].
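One common way to start exploring the convergent and divergent validity questions above is to correlate the scores from the different measures. The following is a minimal sketch in Python, not a prescribed method: the scores are invented purely for illustration, the variable names (survey_scores, observer_scores, general_stress) and the helper pearson_r are our own assumptions, and a real study would need a far larger sample and a properly justified analysis.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for eight participants (made up for illustration only).
survey_scores = [12, 18, 7, 22, 15, 9, 20, 11]      # 10-question survey
observer_scores = [14, 19, 8, 24, 13, 10, 21, 12]   # nurses' observation scale
general_stress = [30, 22, 27, 25, 31, 24, 28, 26]   # a different construct

# Convergent evidence: two measures of the same construct should correlate strongly.
convergent_r = pearson_r(survey_scores, observer_scores)

# Divergent evidence: the survey should correlate only weakly with a different construct.
divergent_r = pearson_r(survey_scores, general_stress)

print(f"survey vs. observation scale: r = {convergent_r:.2f}")
print(f"survey vs. general stress:    r = {divergent_r:.2f}")
```

In this made-up data, the two measures of post-natal depression correlate strongly with each other, while the survey correlates only weakly with the other construct. A pattern of this kind would be consistent with convergent and divergent validity, although a correlation alone does not establish either.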
Ultimately, for construct validity to exist, there needs to be (a) a clear link between the construct you are interested in and the measures and interventions that are used to operationalize it (i.e., measure it), and (b) a clear distinction between different constructs (Cronbach & Meehl, 1955; Nunnally, 1978). This involves creating clear and precise conceptual and operational definitions of the constructs you are interested in [see the section on Constructs in quantitative research], as well as performing various tests of validity.
You will not be able to demonstrate construct validity in a single study, although it is good practice, and valued by dissertation supervisors, to approach a study wanting to establish as much construct validity as possible. Clearly, there are some tests of validity that you will need to carry out during your study that will help to improve the construct validity of your measurement procedure (e.g., content validity). It may also be possible, and will certainly be desirable, to carry out other tests of validity that will give you more confidence that your measurement procedure is construct valid (e.g., convergent and divergent validity, and concurrent and predictive validity). However, one of the more difficult assessments of construct validity to carry out in a single study, which is extremely important but less likely to be performed, is checking that the scores attained from your measurement procedure for a given construct behave in a way that is consistent with that construct. For example, imagine that you are interested in the construct, post-natal depression, and want to create a single measurement procedure to measure post-natal depression. Imagine also that a number of studies have shown that another construct, financial stress, is strongly related to post-natal depression; that is, as financial stress increases, post-natal depression increases by a certain amount. Whilst your dissertation on post-natal depression may not have looked at financial stress at all, you need to show that the scores you obtained from your measurement procedure are consistent with the scores (i.e., behaviour) from the related construct (i.e., financial stress).
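To make the financial stress example more concrete, the sketch below estimates how much post-natal depression scores rise per unit of financial stress and compares that with a range taken from earlier studies. It is only an illustration under stated assumptions: the scores, the range in expected_slope_range, and the helper ols_slope are all hypothetical, and we have chosen a simple least-squares slope as one possible way to express "increases by a certain amount".

```python
def ols_slope(x, y):
    """Least-squares slope of y on x: the average change in y per unit of x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x scores must vary for a slope to be defined")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical scores for six participants (made up for illustration only).
financial_stress = [1, 2, 3, 4, 5, 6]
depression_scores = [8, 9, 12, 13, 15, 18]

# Hypothetical range for the slope, standing in for what prior studies report.
expected_slope_range = (1.5, 2.5)

slope = ols_slope(financial_stress, depression_scores)
low, high = expected_slope_range
consistent = low <= slope <= high

print(f"estimated slope: {slope:.2f}")
print(f"consistent with the assumed prior range: {consistent}")
```

If the estimated slope falls well outside what earlier studies lead you to expect, this could suggest that the measurement procedure is not capturing the construct in the way those studies did, although differences in samples or scales could also explain the gap.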
In the section that follows, we discuss potential threats to construct validity.