Sampling is an important component of any piece of research because of the significant impact that it can have on the **quality** of your **results/findings**. If you are new to sampling, there are a number of **key terms** and **basic principles** that act as a foundation to the subject. This article explains these key terms and basic principles. Rather than a comprehensive look at sampling, the article presents the **sampling basics** that you would need to know if you were an undergraduate or master's level student about to undertake a dissertation (or similar piece of research). It also provides links to other articles within the Sampling Strategy section of this website that you may find useful. Some of the key sampling terms you will come across include **population**, **units**, **sample**, **sample size**, **sampling frame**, **sampling techniques** and **sampling bias**. Each is discussed in turn:

The word **population** means something different in research than it does in everyday use. Typically, we refer to the population of a country (or region), such as the United States or Great Britain. However, in research (and the **theory of sampling**), a population signifies the units that we are interested in studying. These **units** could be **people**, **cases** or **pieces of data**. Some examples of each of these types of population are presented below:

People

Students enrolled at a university (e.g., Harvard University) or studying a particular course (e.g., Statistics 101)

United States Senators or Congressmen who are Democrats

Users of Facebook or Twitter

Presidents and CEOs of Fortune 500 or FTSE 100 companies

Nurses working at hospitals in the State of Texas

Cases (i.e., organisations, institutions, countries, etc.)

Recruitment agencies in Greater London, England

Law firms in Manhattan, New York, United States

The World Trade Organisation (WTO)

The European Parliament

Countries that are members of NATO

Signatories of the Helsinki Accord

Pieces of data

Customer transactions at Wal-Mart or Tesco between two time points (e.g., 1st April 2009 and 31st March 2010)

The braking distances (in kpm/m) of a particular model of car

University applications in the United States in 2011

Households with broadband subscriptions in the town of Carmarthen, Wales

When thinking about the population you are interested in studying, it is important to be **precise**. For example, if we say that our population is **users of Facebook**, this would imply that we were interested in all 500 million (or more) Facebook users, irrespective of what country they were in, whether they were male or female, what age they were, how often they used Facebook, and so forth. However, if the population you were interested in was **more specific**, you should make this clear. Perhaps our population is not **Facebook users**, but **frequent, male Facebook users in the United States**. When we come to **describe** our population further, we would also need to define what we meant by **frequent users** (e.g., people that log in to Facebook at least once a day).

As discussed above, the **population** that you are interested in consists of **units**, which can be **people**, **cases** or **pieces of data**. These terms are sometimes used interchangeably. On this website, we use the word **units** whenever we are referring to those things that make up a population. However, since you may find other textbooks referring to these units as people, cases, or pieces of data, we have provided some further clarification below:

The population you are interested in consists of one or more units. For example, if the population we were interested in was all 500 million (or more) Facebook users, **each** of these Facebook users would be **a unit**. So we would have 500 million (or more) units in our population. If we were interested in CEOs (or Presidents) of Fortune 500 companies, the **CEOs** (or **Presidents**) would be our units.

Sometimes the word **units** is replaced with the word **cases**. As highlighted in the population examples above, sometimes the populations we are interested in are organisations, institutions and countries. In such cases, it is often more appropriate to refer to **each** of these (e.g., recruitment agencies, law firms) as **cases**. You may be interested in a population that consists of only **one case** (e.g., the World Trade Organisation or European Parliament) or maybe you are interested in a population that has **many cases** (e.g., recruitment agencies in London, of which there must be hundreds).

Finally, researchers sometimes refer to populations consisting of **data** (or **pieces of data**) instead of **units** or **cases**. For example, researchers may be interested in **customer transactions** at a particular supermarket (e.g., Wal-Mart or Tesco) between two time points (e.g., 1st April 2009 and 31st March 2010), perhaps because they want to examine the effect of certain promotions on sales figures.

When we are interested in a population, it is often **impractical** and sometimes **undesirable** to try to study the **entire** population. For example, if the population we were interested in was **frequent, male Facebook users in the United States**, this could be **millions** of users (i.e., millions of units). If we chose to study these Facebook users using structured interviews (i.e., our chosen research method), it could take a lifetime. Therefore, we choose to study just a **sample** of these Facebook users.

Whilst we discuss more about sampling and why we sample later in this article, the important point to remember here is that a sample consists of only those units (in this case, Facebook users) from our population of interest (i.e., X million frequent, male, Facebook users in the United States) that we actually study (e.g., 500 or 1000 of these Facebook users).

The **sample size** is simply the number of units in your sample. In the example above, the sample size selected may be just 500 or 1000 of the Facebook users that are part of our population of **frequent, male, Facebook users in the United States**.
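The relationship between a precisely defined population, a sample and the sample size can be sketched in a few lines of Python. This is only an illustration: the `User` record, the `build_users` helper and every value it generates are made up for the example, and "frequent" is assumed to mean at least one login a day, as suggested earlier in the article.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical record for one unit (one Facebook user)."""
    user_id: int
    country: str
    gender: str
    logins_per_day: float


def build_users(n, seed=0):
    """Generate made-up user records purely for illustration."""
    rng = random.Random(seed)
    return [
        User(
            user_id=i,
            country=rng.choice(["US", "GB", "IN", "BR"]),
            gender=rng.choice(["male", "female"]),
            logins_per_day=rng.choice([0.1, 0.5, 1, 3]),
        )
        for i in range(n)
    ]


def is_in_population(user):
    """Precise population definition: frequent, male, US-based users."""
    return (
        user.country == "US"
        and user.gender == "male"
        and user.logins_per_day >= 1  # assumed meaning of "frequent"
    )


all_users = build_users(20_000)

# The population is every unit that meets the definition, not every user.
population = [u for u in all_users if is_in_population(u)]

# The sample is the subset of the population that we actually study.
sample_size = 500
sample = random.Random(1).sample(population, sample_size)

print(f"Units in population: {len(population)}")
print(f"Sample size: {len(sample)}")
```

Changing `is_in_population` changes the population itself, which is why the definition needs to be settled before any units are selected.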

In practice, the sample size that is selected for a study can have a significant impact on the **quality** of your **results/findings**, with sample sizes that are either **too small** or **excessively large** both potentially leading to incorrect findings. As a result, **sample size calculations** are sometimes performed to determine **how large** your sample size needs to be to avoid such problems. However, these calculations can be complex, and are typically not performed at the undergraduate and master's level when completing a dissertation.
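To give a flavour of what such a calculation can look like, here is a minimal sketch of one widely taught formula for estimating a proportion (often attributed to Cochran), with an optional finite population correction. The article does not prescribe any particular method, so the choice of formula, the function name `required_sample_size` and the default values are all assumptions made for illustration.

```python
import math


def required_sample_size(margin_of_error=0.05, z=1.96, proportion=0.5,
                         population_size=None):
    """Approximate sample size needed to estimate a proportion.

    z=1.96 corresponds to roughly 95% confidence. proportion=0.5 is the
    most conservative choice because it maximises the required size.
    """
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    if population_size is not None:
        # Finite population correction: a small population needs fewer units.
        n0 = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n0)


# Very large population (no correction applied).
large_population = required_sample_size()

# Hypothetical population of 1,250 units.
small_population = required_sample_size(population_size=1250)

print(large_population, small_population)
```

Real studies often need calculations tailored to the research design and the statistical tests involved, so a result from a sketch like this is best treated as a rough guide.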

The **sampling frame** is very similar to the population you are studying, and may be **exactly the same**. When selecting units from the population to be included in your sample, it is sometimes desirable to get hold of a **list of the population** from which you select units. This is the case when using certain types of **sampling technique** (i.e., **probability sampling techniques**), which we discuss later in the article. This **list** can be referred to as the **sampling frame**. We explain more about sampling frames in the article: Probability sampling.

Sampling bias occurs when the units that are selected from the population for inclusion in your sample are **not characteristic of** (i.e., do not reflect) the population. This can lead to your sample being **unrepresentative** of the population you are interested in.

For example, suppose you want to measure **how often residents in New York go to a Broadway show in a given year**. Standing on Broadway and asking passers-by how often they went to Broadway shows in a given year would not make sense, because a higher proportion of those passing by are likely to have just come out of a show. The sample would therefore be **biased**.

For this reason, we have to think carefully about the types of sampling techniques we use when selecting units to be included in our sample. Some sampling techniques, such as **convenience sampling** (a **type** of **non-probability sampling**, which is what the Broadway example above illustrates), are prone to **greater bias** than **probability sampling techniques**. We discuss sampling techniques further next.
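The Broadway example can be simulated to show how the bias arises. The sketch below is built entirely on invented numbers: the 80/15/5 split of attendance, and the assumption that the chance of meeting someone on Broadway rises with how many shows they attend, are hypothetical and chosen only to make the effect visible.

```python
import random

rng = random.Random(7)

# Hypothetical residents: number of Broadway shows each attends per year.
# The 80/15/5 split is invented for illustration only.
shows_per_year = rng.choices([0, 1, 5], weights=[80, 15, 5], k=20_000)


def mean(values):
    return sum(values) / len(values)


# Probability-style selection: every resident has the same chance.
random_sample = rng.sample(shows_per_year, 500)

# Street intercept on Broadway: assumed chance of being met grows with
# attendance. The +0.1 gives non-attendees a small chance of passing by.
# Note: rng.choices draws with replacement, which is acceptable here.
intercept_weights = [shows + 0.1 for shows in shows_per_year]
street_sample = rng.choices(shows_per_year, weights=intercept_weights, k=500)

true_mean = mean(shows_per_year)
random_mean = mean(random_sample)
street_mean = mean(street_sample)

print(f"Population mean:    {true_mean:.2f}")
print(f"Random sample mean: {random_mean:.2f}")
print(f"Street sample mean: {street_mean:.2f}")
```

Under these assumptions the street sample overstates attendance, while the random sample stays close to the population figure.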

As we have mentioned above, when we are interested in a population, we typically study a sample of that population rather than attempt to study the whole population (e.g., just 500 of the X million frequent, male Facebook users in the United States). If we imagine that our desired sample size was just 500 of these Facebook users, the question arises: How do we know which Facebook users to invite to take part in our study? In other words, which Facebook users will become part of our sample?

The purpose of **sampling techniques** is to help you select units (e.g., Facebook users) to be included in your sample (e.g., of 500 Facebook users). Broadly speaking, there are two **groups** of sampling technique: **probability sampling techniques** and **non-probability sampling techniques**.

Probability sampling techniques

Probability sampling techniques use **random selection** (i.e., **probabilistic methods**) to help you select units from your sampling frame (i.e., similar to or exactly the same as your population) to be included in your sample. These **procedures** (i.e., **probabilistic methods**) are very clearly defined, making it easy to follow them. Since the **characteristics** of the sample researchers are interested in vary, different **types** of probability sampling technique exist to help you select the appropriate units to be included in your sample. These types of probability sampling technique include **simple random sampling**, **systematic random sampling**, **stratified random sampling** and **cluster sampling**.

We discuss probability sampling in more detail in the article, Probability sampling. We also discuss each of these different types of probability sampling technique, how to carry them out, and their advantages and disadvantages [see the articles: Simple random sampling, Systematic random sampling and Stratified random sampling].
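Three of these techniques can be sketched as follows (cluster sampling is left out to keep the example short). This is a simplified illustration rather than a full procedure: the sampling frame of student records is hypothetical, the function names are invented here, and the stratified version assumes proportionate allocation, which is only one of several ways to allocate units to strata.

```python
import random
from collections import defaultdict


def simple_random_sample(frame, n, rng):
    """Every unit in the frame has an equal chance of selection."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Pick a random start, then take every k-th unit from the frame."""
    k = len(frame) // n          # sampling interval (requires n <= len(frame))
    start = rng.randrange(k)     # random starting point within the first interval
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(frame, n, stratum_of, rng):
    """Proportionate stratified sample: randomly sample within each stratum.

    Rounding can make the total differ slightly from n when stratum
    sizes do not divide evenly.
    """
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    chosen = []
    for units in strata.values():
        share = round(n * len(units) / len(frame))
        chosen.extend(rng.sample(units, share))
    return chosen


# Hypothetical sampling frame: 1,000 students as (student_id, level) pairs,
# 750 undergraduates and 250 postgraduates.
frame = [(i, "postgraduate" if i % 4 == 0 else "undergraduate")
         for i in range(1000)]

rng = random.Random(2024)
srs = simple_random_sample(frame, 100, rng)
systematic = systematic_sample(frame, 100, rng)
stratified = stratified_sample(frame, 100, lambda unit: unit[1], rng)

print(len(srs), len(systematic), len(stratified))
```

All three functions need the list of units (the sampling frame) up front, which is one reason a frame matters for probability sampling.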

Non-probability sampling techniques

Non-probability sampling techniques rely on the **subjective judgement** of the researcher when selecting units from the population to be included in the sample. For some of the different types of non-probability sampling technique, the **procedures** for selecting units to be included in the sample are very clearly defined, just like probability sampling techniques. However, in others (e.g., **purposive sampling**), the **subjective judgement** required to select units from the population, which involves a combination of **theory**, **experience** and **insight from the research process**, makes selecting units more complicated. Overall, the types of non-probability sampling technique you are likely to come across include **quota sampling**, **purposive sampling**, **convenience sampling**, **snowball sampling** and **self-selection sampling**.

We discuss non-probability sampling in more detail in the article, Non-probability sampling. We also discuss each of these different types of non-probability sampling technique, how to carry them out, and their advantages and disadvantages [see the articles: Quota sampling, Purposive sampling, Convenience sampling, Snowball sampling and Self-selection sampling].
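As an illustration of a non-probability technique with a clearly defined procedure, quota sampling might be sketched as below. The code assumes the simplest possible reading of the technique (accept units in the order they are encountered until each group's quota is full); the `quota_sample` function, the stream of passers-by and the quotas are all hypothetical.

```python
def quota_sample(stream, quotas, group_of):
    """Accept units in the order encountered until every quota is filled.

    No random selection is involved: who ends up in the sample depends
    on who the researcher happens to encounter first.
    """
    remaining = dict(quotas)
    chosen = []
    for unit in stream:
        group = group_of(unit)
        if remaining.get(group, 0) > 0:
            chosen.append(unit)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota has been met
    return chosen


# Hypothetical stream of passers-by as (person_id, gender) pairs.
passers_by = [(i, "male" if i % 3 == 0 else "female") for i in range(100)]

quota = quota_sample(
    passers_by,
    quotas={"male": 5, "female": 5},
    group_of=lambda unit: unit[1],
)

print(quota)
```

Because selection stops as soon as the quotas are met, units that appear later in the stream have no chance of being included, which is the kind of property that distinguishes this approach from probability sampling.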

If you want to know more about the sampling techniques you may use in your dissertation, read up on probability sampling and non-probability sampling.