Criterion validity (concurrent and predictive validity) – Research Strategy

There are many occasions when you might choose to use a well-established measurement procedure (e.g., a 42-item survey on depression) as the basis to create a new measurement procedure (e.g., a 19-item survey on depression) to measure the construct you are interested in (e.g., depression, sleep quality, or employee commitment). This well-established measurement procedure acts as the criterion against which the criterion validity of the new measurement procedure is assessed. Like other forms of validity, criterion validity is not something that your measurement procedure simply has (or doesn't have). You will have to build a case for the criterion validity of your measurement procedure; ultimately, it is something that will be developed over time as more studies validate your measurement procedure. To assess criterion validity in your dissertation, you can choose between establishing the concurrent validity or predictive validity of your measurement procedure. These are two different types of criterion validity, each of which has a specific purpose. In this article, we first explain what criterion validity is and when it should be used, before discussing concurrent validity and predictive validity, providing examples of both.

What is criterion validity?

Criterion validity reflects the use of a criterion - a well-established measurement procedure - to create a new measurement procedure to measure the construct you are interested in. The criterion and the new measurement procedure must be theoretically related. The measurement procedures could include a range of research methods (e.g., surveys, structured observation, or structured interviews), provided that they yield quantitative data.

There are a number of reasons why we would be interested in using criteria to create a new measurement procedure: (a) to create a shorter version of a well-established measurement procedure; (b) to account for a new context, location, and/or culture where well-established measurement procedures need to be modified or completely altered; and (c) to help test the theoretical relatedness and construct validity of a well-established measurement procedure. Each of these is discussed in turn:

To create a shorter version of a well-established measurement procedure
You want to create a shorter version of an existing measurement procedure, which is unlikely to be achieved through simply removing one or two measures within the measurement procedure (e.g., one or two questions in a survey), possibly because this would affect the content validity of the measurement procedure [see the article: Content validity]. Therefore, you have to create new measures for the new measurement procedure. However, to ensure that you have built a valid new measurement procedure, you need to compare it against one that is already well-established; that is, one that already has demonstrated construct validity and reliability [see the articles: Construct validity and Reliability in research]. This well-established measurement procedure is the criterion against which you are comparing the new measurement procedure (i.e., this is why we call it criterion validity).
Indeed, sometimes a well-established measurement procedure (e.g., a survey), which has strong construct validity and reliability, is either too long or longer than would be preferable. A measurement procedure can be too long because it consists of too many measures (e.g., a 100-question survey measuring depression). Whilst the measurement procedure may be content valid (i.e., consist of measures that are appropriate/relevant and representative of the construct being measured), it is of limited practical use if response rates are particularly low because participants are simply unwilling to take the time to complete such a long measurement procedure. A measurement procedure may also be longer than would be preferable, for much the same reason: it is easier to get respondents to complete a measurement procedure when it is shorter. The difference is that the existing measurement procedure may not be too long in itself (e.g., having only 40 questions in a survey), but would encourage much greater response rates if it were shorter (e.g., having just 18 questions). This may be a time consideration, but it is also an issue when you are combining multiple measurement procedures, each of which has a large number of measures (e.g., combining two surveys, each with around 40 questions).
To account for a new context, location and/or culture where well-established measurement procedures may need to be modified or completely altered
You are conducting a study in a new context, location and/or culture, where well-established measurement procedures no longer fit. As a result, you need to take a well-established measurement procedure, which acts as your criterion, and use it to create a new measurement procedure that is more appropriate for the new context, location, and/or culture. The new measurement procedure may only need to be modified, or it may need to be completely altered. Either way, it must be based on a criterion (i.e., a well-established measurement procedure).
For example, you may want to translate a well-established measurement procedure, which is construct valid, from one language (e.g., English) into another (e.g., Chinese or French). Since the English and French languages have some base commonalities, the content of the measurement procedure (i.e., the measures within the measurement procedure) may only have to be modified. However, such content may have to be completely altered when a translation into Chinese is made because of the fundamental differences in the two languages (i.e., Chinese and English). Nonetheless, the new measurement procedure (i.e., the translated measurement procedure) should have criterion validity; that is, it must reflect the well-established measurement procedure upon which it was based.
In research, it is common to want to take measurement procedures that have been well-established in one context, location, and/or culture, and apply them to another context, location, and/or culture. Criterion validity is a good test of whether such newly applied measurement procedures reflect the criterion upon which they are based. When they do not, this suggests that new measurement procedures need to be created that are more appropriate for the new context, location, and/or culture of interest.
To help test the theoretical relatedness and construct validity of a well-established measurement procedure
It could also be argued that testing for criterion validity is an additional way of testing the construct validity of an existing, well-established measurement procedure. After all, if the new measurement procedure, which uses different measures (i.e., has different content), but measures the same construct, is strongly related to the well-established measurement procedure, this gives us more confidence in the construct validity of the existing measurement procedure.

Criterion validity is demonstrated when there is a strong relationship between the scores from the two measurement procedures, which is typically examined using a correlation. For example, participants who score high on the new measurement procedure would also score high on the well-established measurement procedure; and the same would be said for medium and low scores.
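
To make the idea of examining this relationship with a correlation more concrete, the following is a minimal sketch in Python. It assumes that each participant has completed both measurement procedures and that a Pearson correlation is appropriate for the data; the article itself does not specify which correlation coefficient to use. The scores, variable names, and the pearson_r function are hypothetical and for illustration only.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of each participant's score from the mean of that procedure
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    sum_products = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sum_products / math.sqrt(ss_x * ss_y)


# Hypothetical total scores for the same eight participants (made-up data)
criterion_scores = [35, 12, 28, 40, 8, 22, 31, 17]  # well-established 42-item survey
new_scores = [16, 5, 12, 18, 4, 9, 15, 7]           # new 19-item survey

r = pearson_r(new_scores, criterion_scores)
print(f"Correlation between new and criterion scores: r = {r:.2f}")
```

A value of r close to +1 would indicate that participants who score high on the new measurement procedure also tend to score high on the criterion. How strong the correlation needs to be is a judgement that depends on the field and the construct being measured.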

However, you do not assess criterion validity per se; rather, you establish either concurrent validity or predictive validity. There are two things to think about when choosing between concurrent and predictive validity:

The purpose of the study and measurement procedure
You need to consider the purpose of the study and measurement procedure; that is, whether you are trying (a) to use an existing, well-established measurement procedure in order to create a new measurement procedure (i.e., concurrent validity), or (b) to examine whether a measurement procedure can be used to make predictions (i.e., predictive validity).
Study constraints
Testing for concurrent validity is likely to be simpler, more cost-effective, and less time intensive than predictive validity. This sometimes encourages researchers to first test for the concurrent validity of a new measurement procedure, before later testing it for predictive validity when more resources and time are available.