Systematic random sampling is a type of probability sampling technique [see our article Probability sampling if you do not know what probability sampling is]. With the systematic random sample, there is an equal chance (probability) of selecting each unit from within the population when creating the sample. The systematic sample is a variation on the simple random sample. Rather than referring to random number tables to select every case that will be included in your sample, you select units directly from the sampling frame at a fixed interval [see our article, Sampling: The basics, if you are unsure about the terms unit, sample, sampling frame and population]. This article explains (a) what systematic random sampling is, (b) how to create a systematic random sample, and (c) the advantages and disadvantages (limitations) of systematic random sampling.
Imagine that a researcher wants to understand more about the career goals of students at the University of Bath. Let's say that the university has roughly 10,000 students. These 10,000 students are our population (N). In order to select a sample (n) of students from this population of 10,000 students, we could choose to use a systematic random sample.
With systematic random sampling, there would be an equal chance (probability) that each of the 10,000 students could be selected for inclusion in our sample. Each of the 10,000 students is known as a unit, a case or an object (these terms are sometimes used interchangeably; we use the word unit). If our desired sample size were around 400 students, each of these students would subsequently be sent a questionnaire to complete (imagining we choose to collect our data using a questionnaire).
To create a systematic random sample, there are seven steps: (a) defining the population; (b) choosing your sample size; (c) listing the population; (d) assigning numbers to cases; (e) calculating the sampling fraction; (f) selecting the first unit; and (g) selecting your sample.
In our example, the population is the 10,000 students at the University of Bath. The population is expressed as N. Since we are interested in all of these university students, we can say that our sampling frame is all 10,000 students. If we were only interested in female university students, for example, we would exclude all males when creating our sampling frame, which would then contain far fewer than 10,000 students.
Let's imagine that we choose a sample size of 100 students. The sample is expressed as n. This number was chosen because it reflects the limit of our budget and the time we have to distribute our questionnaire to students. However, we could have also determined the sample size we needed using a sample size calculation, which is a particularly useful statistical tool. This may have suggested that we needed a larger sample size; perhaps as many as 400 students.
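The article does not say which sample size calculation it has in mind. As a rough sketch, the Python below assumes one common choice: Cochran's formula for a proportion, with a finite population correction. The function name, the 95% confidence level (z = 1.96), the 5% margin of error and the assumed proportion of 0.5 are all illustrative assumptions, not values taken from the example.

```python
import math


def cochran_sample_size(population_size, margin_of_error=0.05,
                        z_score=1.96, proportion=0.5):
    """Estimate a sample size using Cochran's formula (an assumed method).

    A finite population correction is applied because the population
    (N) is known and not very large.
    """
    # Sample size for an effectively infinite population.
    n0 = (z_score ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction for a population of size N.
    n = n0 / (1 + (n0 - 1) / population_size)
    # Round up, since we cannot survey a fraction of a student.
    return math.ceil(n)


print(cochran_sample_size(10_000))  # 370
```

Under these assumed settings the calculation suggests about 370 students, which is in the same region as the 400 mentioned above. Different assumptions about confidence level or margin of error would give a different figure.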
To select a sample of 100 students, we need to identify all 10,000 students at the University of Bath. If you were actually carrying out this research, you would most likely have had to receive permission from Student Records (or another department in the university) to view a list of all students studying at the university. You can read about this later in the article under Disadvantages (limitations) of systematic random sampling.
We now need to assign a consecutive number from 1 to N, next to each of the students. In our case, this would mean assigning a consecutive number from 1 to 10,000 (i.e. N = 10,000; your population of students at the university).
Assuming we have chosen a sample size of 100 students, we now need to work out the sampling fraction, which is simply the sample size selected (expressed as n) divided by the population size (N). In this case:
Sampling fraction = n / N = 100 / 10,000 = 1/100
The sampling fraction tells us that we need to select 1 student in every 100 students from the population of 10,000 students at the university. After doing this 100 times, we will have our sample of 100 students. However, first we need to select the first unit (i.e., the first student), which starts the process of creating our sample.
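The sampling fraction arithmetic can be sketched in a few lines of Python. The variable names are our own, and the sampling interval (often written k) is simply the inverse of the sampling fraction, i.e., the "1 in every 100" described above.

```python
from fractions import Fraction

population_size = 10_000  # N: all students at the university
sample_size = 100         # n: the sample size we chose

# Sampling fraction: n divided by N, kept as an exact fraction.
sampling_fraction = Fraction(sample_size, population_size)

# Sampling interval (k): select 1 student in every k students.
sampling_interval = population_size // sample_size

print(sampling_fraction)  # 1/100
print(sampling_interval)  # 100
```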
Since we need to select 1 student in every 100 students, we first use a random number table to select the first student. Imagine the first number in the random number table was 0009. We would ignore the leading zeros and focus on the last digit, 9, since this number falls between 1 and 100. As such, our first student would be the 9th on our list of 10,000 students.
Now that we know the first unit, namely the 9th student on the list, we can select the other 99 students to make up our sample of 100 students. Since we need to select 1 student in every 100 students from the list, we use the 9th student as the starting point and then select every 100th student from this point. As such, we select the 109th student on the list, the 209th student, the 309th student, and so forth.
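The last two steps (selecting the first unit and then selecting the sample) can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function name systematic_sample is hypothetical, Python's random module stands in for the random number table, and the sketch assumes that N is an exact multiple of n, as it is in our example.

```python
import random


def systematic_sample(population_size, sample_size, start=None, rng=None):
    """Return the 1-based list positions of a systematic random sample.

    Assumes population_size is an exact multiple of sample_size.
    """
    if population_size % sample_size != 0:
        raise ValueError("this sketch assumes N is a multiple of n")
    interval = population_size // sample_size
    if start is None:
        # Stand-in for the random number table: a start between 1 and k.
        rng = rng or random.Random()
        start = rng.randint(1, interval)
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    # Take the starting unit, then every k-th unit after it.
    return list(range(start, population_size + 1, interval))


# Reproduce the worked example: the first unit is the 9th student.
positions = systematic_sample(10_000, 100, start=9)
print(positions[:4])   # [9, 109, 209, 309]
print(len(positions))  # 100
print(positions[-1])   # 9909
```

Leaving out the start argument draws a random starting point instead, which is what you would do in practice.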
The advantages and disadvantages (limitations) of systematic random sampling are explained below. Many of these are similar to other types of probability sampling technique, but with some exceptions. Whilst systematic random sampling is one of the "gold standards" of sampling techniques, it presents many challenges for students conducting dissertation research at the undergraduate and master's level.
Advantages of systematic random sampling
The aim of the systematic random sample is to reduce the potential for human bias in the selection of cases to be included in the sample. As a result, the systematic random sample provides us with a sample that is highly representative of the population being studied, assuming that there is limited missing data.
Since the units selected for inclusion within the sample are chosen using probabilistic methods, systematic random sampling allows us to make statistical conclusions from the data collected that will be considered to be valid.
Relative to the simple random sample, the selection of units using a systematic procedure can be viewed as superior because it improves the potential for the units to be more evenly spread over the population.
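The point about units being more evenly spread can be illustrated with a short simulation. The sketch below splits our list of 10,000 students into 100 consecutive blocks of 100 and counts how many blocks contribute at least one student to each type of sample. The helper name blocks_covered and the seed value are our own assumptions, chosen only for illustration.

```python
import random

population_size, sample_size = 10_000, 100
interval = population_size // sample_size
rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Systematic sample: random start, then every 100th student.
start = rng.randint(1, interval)
systematic = list(range(start, population_size + 1, interval))

# Simple random sample of the same size, for comparison.
simple = rng.sample(range(1, population_size + 1), sample_size)


def blocks_covered(positions):
    """Count how many blocks of `interval` students contain a sampled unit."""
    return len({(position - 1) // interval for position in positions})


print(blocks_covered(systematic))  # always 100: one student per block
print(blocks_covered(simple))      # usually fewer, since some blocks are missed
```

The systematic sample takes exactly one student from every block of 100, whereas a simple random sample will typically leave some blocks empty and draw several students from others.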
Disadvantages (limitations) of systematic random sampling
A systematic random sample can only be carried out if a complete list of the population is available.
If the list of the population has some kind of standardised arrangement (order/pattern), systematic sampling could pick out similar cases rather than completely random ones. For example, when Student Records put together the list of the 10,000 students (our example), the list may have been ordered so that each record moved from a male to female student (i.e., record #1 was a male student, record #2 a female student, record #3 a male student again, and so forth). This may have been intentional or unintentional. Either way, if we select the 9th student in every hundred from the list (as per our example; i.e., the 9th, 109th, 209th student, and so forth), we will always select a male student (i.e., all odd numbers in the list are male students, whilst all even numbers are female students). This will lead to a very biased sample. In reality, such a bias in the list should be easily seen and corrected. However, sometimes such a standardised arrangement (order/pattern) may not be obvious or visible, resulting in sampling bias.
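The male/female ordering problem described above can be reproduced with a small sketch. The function sex_of_record is hypothetical: it simply encodes the assumed pattern in which odd-numbered records are male students and even-numbered records are female students.

```python
from collections import Counter


def sex_of_record(position):
    """Hypothetical list ordering: odd records are male, even are female."""
    return "male" if position % 2 == 1 else "female"


# The sample from our example: the 9th, 109th, 209th student, and so forth.
positions = list(range(9, 10_001, 100))

counts = Counter(sex_of_record(position) for position in positions)
print(counts)  # Counter({'male': 100})
```

Because the sampling interval (100) is a multiple of the length of the pattern (2), every selected position is odd, so no female students are selected at all. An even starting point, such as the 10th student, would produce the opposite bias.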
Attaining a complete list of the population can be difficult for a number of reasons:
Even if a list is readily available, it may be challenging to gain access to that list. The list may be protected by privacy policies or require a lengthy process to attain permissions.
There may be no single list detailing the population you are interested in. As a result, it may be difficult and time consuming to bring together numerous sub-lists to create a final list from which you want to select your sample. As an undergraduate or master's level dissertation student, you may simply not have sufficient time to do this.
Many lists will not be in the public domain and their purchase may be expensive; at least in terms of the research funds of a typical undergraduate or master's level dissertation student.
In terms of human populations (as opposed to other types of populations; see the article: Sampling: The basics), some of these populations will be expensive and time consuming to contact, even where a list is available. Assuming that your list has all the contact details of potential participants in the first instance, managing the different ways (postal, telephone, email) that may be required to contact your sample may be challenging, not forgetting the fact that your sample may also be geographically scattered.
In the case of human populations, to avoid potential bias in your sample, you will also need to try and ensure that an adequate proportion of your sample takes part in the research. This may require re-contacting non-respondents or reaching out to new respondents, both of which can be very time consuming.