If the experiment in your dissertation focuses on people (i.e., people are the population you are interested in), maturation is likely to threaten the internal validity of your findings. This has to do with time and the effect that time has on people. After all, experiments do not happen overnight, but over a period of time, whether days, weeks, a few months, or in some cases, years. Whilst experiments at the undergraduate and master's dissertation level tend to last no longer than 2-3 months (at least the data collection phase), a number of changes can still take place within such short timeframes. During such periods, people change, and such change can affect your findings. This is the case for all types of experiment, whether in the physical or social sciences, psychology, management, education, or another field of study. Let's look at some examples of maturation effects in the short term and the long term:
Short-term changes and their effects
There are a number of maturation effects that can occur in the very short term; that is, within a few hours or days. People's behaviour can change; for example, they can go from being in a good mood to a bad one. Factors such as tiredness, boredom, hunger and inattention can also come into play. These factors can be driven by the research participant or by the experiment itself. The participant may have stayed up late the night before an experiment, causing tiredness; the participant may be thinking about an upcoming coursework deadline or exam, causing inattention; and so forth. Such participant-led factors can be difficult to control, reducing the internal validity of an experiment. However, sometimes these factors (i.e., tiredness, boredom, hunger, inattention, etc.) are the result of the experiment itself.
Longer-term changes and their effects
Other maturation effects can result from longer-term changes, such as getting older, becoming better educated, becoming more affluent, and so forth. However, even within experiments lasting less than a year, and perhaps even just a few months, it is possible for these factors to affect your findings. For example, people can get a new job with a relatively significant pay rise, or they may come into some inheritance money. They may start taking some form of further education, whether within the classroom, at home, or at work. At the same time, getting older can be an issue. Indeed, experiments that focus on people who are elderly, as well as those that involve young children, have the potential to suffer from maturation effects because small changes in age can have a particularly marked impact on a range of physical, social, behavioural, and psychological factors. For example, as people become elderly, there can be a more rapid deterioration in certain characteristics such as vision, hearing, taste, and even memory. This may negatively impact their performance during an experiment. Amongst young children, there is a greater propensity for learning to take place (acquiring new knowledge and skills), as well as for becoming stronger and taller, in a short space of time. Such maturation effects, in addition to (or rather than) the treatment condition, may change the performance of participants in the post-test relative to the pre-test.
The question arises: how confident are you that the observed changes in the dependent variable are due to the treatment (i.e., the intervention) and not to maturation? In principle, such confidence will decrease as the experiment goes on. However, it is not as simple as saying that the longer an experiment lasts, the greater the potential maturation effect. You need to look at the nature of your research and examine whether maturation is likely to be a problem.
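To make this concrete, here is a minimal simulation sketch in Python (all numbers are invented for illustration) of how a maturation effect can masquerade as a treatment effect in a one-group pre-test/post-test design, and how a control group that matures at the same rate allows the two to be separated:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100

# Hypothetical pre-test scores on the dependent variable.
pre = rng.normal(50, 10, n)

# Invented effect sizes: the treatment adds 2 points, but maturation
# over the study period adds 5 points all on its own.
treatment_effect = 2.0
maturation_effect = 5.0

# One-group design: the post-test mixes both effects together.
post = pre + treatment_effect + maturation_effect + rng.normal(0, 2, n)
print("Apparent gain (one group):", (post - pre).mean())
# Roughly 7 points of 'improvement', of which only 2 are due to the treatment.

# A control group that matures at the same rate, but receives no
# treatment, lets us subtract the maturation component back out.
pre_ctrl = rng.normal(50, 10, n)
post_ctrl = pre_ctrl + maturation_effect + rng.normal(0, 2, n)
net_effect = (post - pre).mean() - (post_ctrl - pre_ctrl).mean()
print("Gain net of maturation:", net_effect)  # close to the true 2 points
```

Note that this comparison only isolates the treatment effect under the stated assumption that both groups mature at the same rate; if they mature differently, maturation remains a threat.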
Testing effects, also known as order effects, only occur in experimental and quasi-experimental research designs that have more than one stage; that is, research designs that involve a pre-test and a post-test. In such circumstances, the fact that the person taking part in the research is tested more than once can influence their behaviour/scores in the post-test, which confounds the results; that is, the differences in scores on the dependent variable between the groups being studied may be due to testing effects rather than the independent variable. Some of the reasons why testing effects occur include learning effects (practice or carry-over effects) and experimental fatigue. Each is discussed in turn:
Learning effects (practice or carry-over effects)
Learning effects, also known as practice effects or carry-over effects, result in increased post-test performance (i.e., higher scores on the dependent variable) because participants have become familiar with some aspect of the experiment (e.g., its subject matter) from the pre-test. As a result of these learning effects, during the post-test, participants may:
a) Understand the format of the experiment
b) Understand the purpose of the experiment
c) Become familiar with the testing environment
d) Develop a strategy/approach to do better/worse in the experiment (or moderate their outcome)
e) Become less anxious about the experiment
Where learning effects relate to the measurement procedure (e.g., a, b and c above), this is often called habituation. Where such learning effects relate to memory effects (e.g., d and e above), this is often called sensitization.
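As a simple illustration of how such learning effects can confound a pre-test/post-test comparison, here is a short Python sketch (with invented numbers) in which the treatment has no effect at all, yet a practice effect still produces an apparent gain:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Hypothetical underlying ability, measured with noise at the pre-test.
ability = rng.normal(100, 15, n)
pre = ability + rng.normal(0, 5, n)

# Invented practice effect: familiarity with the test format adds
# 4 points at the post-test, even though the treatment does nothing.
practice_effect = 4.0
treatment_effect = 0.0

post = ability + treatment_effect + practice_effect + rng.normal(0, 5, n)
print("Mean pre-to-post gain:", (post - pre).mean())
# About 4 points of apparent 'improvement', produced entirely by the
# testing effect rather than by the (non-existent) treatment effect.
```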
Experimental fatigue / General experiences during the experiment
Experimental fatigue reflects general experiences that take place during the experiment and that lead to physical and/or mental fatigue. This could be due to a particular treatment, which may be physically and/or mentally demanding, or simply due to the fact that being part of a research project, which is unusual for most participants, can be tiring.
Testing effects are not a problem in all studies. For example, as a "general rule of thumb", testing effects are less likely to be a threat to internal validity where there is a long interval between the pre-test and post-test than where the interval between tests is short. You need to ask yourself: to what extent are learning effects a problem for the post-test in my experiment?
Instrumentation can be a threat to internal validity because it can result in instrumental bias (also known as instrumental decay). Such instrumental bias takes place when the measuring instrument (e.g., a measuring device, a survey, interviews/participant observation) that is used in a study changes over time. Instrumentation becomes a threat to internal validity when it reduces your confidence that the changes (differences) in the scores on the dependent variable are due to the treatments (i.e., the independent variable) rather than to instrumentation. It sometimes helps to think about instrumental bias arising either because of the use of a physical measuring device or because of the actions of the researcher. Each is discussed in turn:
A physical measurement device
The measurement device in your experiment may be a piece of equipment or some other physical device (e.g., a stopwatch, weighing scales, a speedometer, etc.). Let's look at an example of how such a measurement device can decay over time.
Study #3
How do different types of tennis serve impact on the speed of the tennis ball?
Imagine that we were interested in examining how different types of tennis serve (i.e., the independent variable) impacted on the speed of the tennis ball (i.e., the dependent variable). We choose to compare two types of tennis serve (e.g., slice and top spin), measuring the speed of the tennis ball using a speedometer (i.e., the speed is measured in miles per hour, mph). For the sake of this example, let's imagine that (a) the amount of power that goes into the serve is the same every time, with the balls shot out from a ball machine, and (b) the speedometer is accurate and automatic (i.e., it doesn't rely on a human to operate it, like a speed camera on the road). However, there is still one part of the measurement device that can gradually change over the course of the experiment, creating instrumental bias: the gradual wear and tear of the tennis ball as it deteriorates with every serve. If you've ever watched Wimbledon or the US Open, you'll have seen the tennis balls being replaced every few games. We may not think of the tennis ball as a measurement device in the way we do the speedometer that measures the speed of the ball, but it is part of the measurement apparatus for the experiment. The ball machine and the speedometer cannot record how worn out the ball is, and it may be very difficult to measure this accurately. As the tennis ball wears, especially if the amount of wear differs between one type of serve (e.g., a slice serve) and another (e.g., a top spin serve), the differences in the scores on the dependent variable (i.e., the speed of the tennis ball) will be due not only to the difference in the type of serve, but also to the amount of wear and tear on the ball (e.g., a top spin serve may wear out a ball faster than a slice serve).
This instrumental bias becomes a threat to the internal validity of the experiment, creating another possible explanation (i.e., the wear, or relative wear of the tennis ball) for the differences in the scores on the dependent variable (i.e., the speed of the ball) other than just the differences in the independent variable (i.e., the differences between the two types of tennis serve).
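A minimal simulation sketch of this (in Python, with invented wear rates) shows how differential wear alone can create a difference in measured ball speed between two serve types that are, by construction, identical:

```python
import numpy as np

rng = np.random.default_rng(7)
serves = 50  # serves recorded per condition

# By construction, both serve types produce the same true ball speed...
true_speed = 100.0  # mph

# ...but each serve wears the ball, and (hypothetically) a top spin
# serve wears it faster, shaving more speed off later serves.
wear_per_serve = {"slice": 0.02, "top spin": 0.06}  # mph lost per serve

for serve_type, wear in wear_per_serve.items():
    measured = true_speed - wear * np.arange(serves) + rng.normal(0, 1.0, serves)
    print(f"{serve_type}: mean measured speed = {measured.mean():.2f} mph")
# The two conditions now differ on the dependent variable even though
# the serves are identical: the gap is pure instrumental decay.
```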
Furthermore, measurement devices do not always have the same level of measurement precision. For example, when we think about a speedometer, we would expect it to be as accurate when recording a speed of 100 mph as when recording 20 mph. However, for some measurement devices, this is not the case: they can be less precise when recording some values compared with others. When a measurement device has low precision for values that are high, this is known as a ceiling effect. When the level of precision is low for values that are low, this is called a floor effect. Both ceiling effects and floor effects are types of instrumental bias that can threaten the internal validity of your study.
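For instance, here is a short Python sketch of a ceiling effect, assuming a hypothetical speedometer that cannot register speeds above 110 mph (an invented limit): the device compresses high readings, so the measured difference between conditions understates the true one.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical true ball speeds from two serve types, one genuinely faster.
slow = rng.normal(95, 8, 1000)
fast = rng.normal(115, 8, 1000)

# Assume the speedometer cannot register anything above 110 mph.
DEVICE_MAX = 110.0
measured_slow = np.minimum(slow, DEVICE_MAX)
measured_fast = np.minimum(fast, DEVICE_MAX)

print("True difference:    ", round(fast.mean() - slow.mean(), 1))  # ~20 mph
print("Measured difference:", round(measured_fast.mean() - measured_slow.mean(), 1))
# The ceiling compresses the faster condition's readings, so the
# measured difference understates the true difference between serves.
```

A floor effect works the same way in reverse, compressing readings at the low end of the device's range.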
The actions of the researcher
Sometimes, we can think of the measurement device as the researcher collecting the data, since it is the researcher who makes the assessment of the measurement. This is more likely to occur in qualitative research designs than in quantitative research because qualitative research generally involves less structured and less standardised measurement procedures, such as unstructured and semi-structured interviews and observations. However, quantitative research also involves research methods where the score given on a particular measurement instrument is determined by the researcher.
For example, let's imagine that a researcher is using structured participant observation to assess social awkwardness (i.e., the dependent variable) in two different types of profession (i.e., the independent variable). For simplicity, let's imagine that the researcher monitors these two different groups of employees and scores their level of social awkwardness on a scale of 1-10 (e.g., 10 = extremely socially awkward).
The way that a researcher scores may change during the course of an experiment for two reasons. First, the researcher can gain experience (i.e., become more proficient) or become fatigued during the course of the experiment, which affects the way that observations are recorded. This can happen across groups, but also within a single group (even between the pre-test and post-test). Second, a different researcher may be used for the pre-test and post-test measurements. In quantitative research using structured participant observation, it is important to consider the ability/experience of the researchers, and how this, or other factors relating to the researcher's scoring, may change over time. However, this will only lead to instrumental bias if the way that the researcher scores differs between the groups that are being measured (e.g., the control group versus the treatment group).
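To illustrate the first of these, here is a minimal Python sketch (with invented numbers, including a hypothetical half-point drift in the researcher's scoring): an apparent change in social awkwardness appears between the pre-test and the post-test even though the participants have not changed at all.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 60

# Hypothetical 'true' social awkwardness of participants (1-10 scale),
# held constant between the pre-test and the post-test.
true_score = rng.uniform(3, 8, n)

# Suppose the researcher's scoring drifts with experience: at the
# post-test they rate identical behaviour half a point lower.
rater_drift = -0.5
pre_rating = np.clip(true_score + rng.normal(0, 0.5, n), 1, 10)
post_rating = np.clip(true_score + rater_drift + rng.normal(0, 0.5, n), 1, 10)

print("Apparent pre-to-post change:", (post_rating - pre_rating).mean())
# An apparent change of about -0.5 produced by the rater, not by any
# real change in the participants being observed.
```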
Instrumentation is more likely to become an issue the longer an experiment runs, since there is greater potential for instrumental decay to occur.