It is very common for students to come unstuck when it comes to analysing their data. There are typically two reasons for this: **(a)** a lack of statistical knowledge, which we try to compensate for in the Data Analysis part of Lærd Dissertation; and **(b)** the collection of insufficient data (i.e., your sample size is too small), which means that you have **insufficient statistical power**. In basic terms, this means that the standard statistical tests you would usually use to analyse your data may fail to detect differences that actually exist.

For example, imagine that 100 identical experiments had all shown that the alternative hypothesis, **Males break the speed limit more than females**, was true (i.e., the speeds recorded for males and females differed to a statistically significant degree, with males breaking the speed limit more than females). This would lead us to conclude with quite a strong degree of certainty that the alternative hypothesis was true (i.e., we say that we can **accept** the alternative hypothesis rather than using the word **true**). Now imagine that you conducted the experiment with a new, representative sample of male and female drivers, but the sample was very small (e.g., you could only get 8 males and 6 females to take part). The hypothesis that **males break the speed limit more than females** is still true, but with such a small sample size, the statistical tests that you ran on the data lacked the statistical power required to show that such differences existed. This is known as a **Type II error**, meaning that the null hypothesis is retained when it should have been rejected in favour of the alternative hypothesis (i.e., the null hypothesis would be: **There is no difference in breaking the speed limit between males and females**).
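To get a rough feel for how little power a sample of 8 males and 6 females has, the sketch below approximates the power of a two-sided comparison of two group means using a normal approximation. The function name `approx_power` and the standardised effect size of 0.5 are assumptions made purely for illustration (the example above does not say how large the true difference is), and the normal approximation is somewhat optimistic for samples this small.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n1, n2, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Uses a normal approximation. effect_size is the standardised mean
    difference (difference in means divided by the common standard deviation).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic if the effect really exists.
    shift = effect_size * sqrt(n1 * n2 / (n1 + n2))
    # Probability that the statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical effect size of 0.5 (an assumption, not a value from the text).
small = approx_power(8, 6, effect_size=0.5)
large = approx_power(80, 60, effect_size=0.5)
print(f"8 vs 6 participants:   power is roughly {small:.2f}")
print(f"80 vs 60 participants: power is roughly {large:.2f}")
```

Under these assumptions, the small sample would detect the difference in only a small minority of studies, whereas the sample ten times larger would detect it most of the time.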

Type II errors are seriously problematic in undergraduate and master's level dissertations because **(a)** the sample sizes are often small, partly because of resource constraints, but also a little bit of laziness (just being honest!), and **(b)** you are rarely expected to perform **power calculations** and **sample size calculations** at this level (i.e., these calculations allow you to work out the sample size you need to ensure that the chance of detecting a difference, if one exists, is at an acceptable level, helping you to reduce the potential for a Type II error, as well as reducing the need to collect a larger sample than needed). There is no doubt that a lot of dissertations end up looking as though they found nothing because a large proportion of the results for their alternative hypotheses (and in some cases, all of them) are not statistically significant (i.e., even though some should have been). This not only leads to significant criticism of your findings, and possibly a low mark, but also leaves you with very little to write about. In such situations, in order to show that the dissertation produced **some** findings, students often end up including lots of descriptive statistics and cross-tabulations that do not adequately answer their research hypotheses (i.e., because inferential statistical tests were required), but do put some writing on the page.
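As a rough illustration of what a sample size calculation involves, the sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2 × ((z for alpha + z for power) / effect size)². The function name `sample_size_per_group`, the 5% significance level, the 80% power target, and the three effect sizes are all assumptions chosen for illustration; they are not recommendations from this guide, and exact calculations based on the t distribution give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed in EACH of two groups.

    Normal-approximation formula for a two-sided comparison of two means.
    effect_size is the standardised mean difference you want to detect.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # value corresponding to the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical small, medium and large standardised effects.
for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {sample_size_per_group(d)} per group")
```

The pattern to notice is that the smaller the difference you hope to detect, the more participants you need, and the increase is steep: halving the effect size roughly quadruples the required sample.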

To avoid falling into this trap, make sure that you collect sufficient data. Whilst you do not want an excessively large sample because **(a)** this can be viewed as unethical (i.e., putting too many people through any experiment unnecessarily, even when you have their informed consent), and **(b)** it can create its own problems (i.e., with a very large sample, even trivially small differences can come out as statistically significant, which is worth keeping separate from a **Type I error**, where you reject the null hypothesis when you shouldn't have; that is, you find that there is a difference when one does not actually exist), this is rarely a problem at the undergraduate and master's level. Some students do collect large samples, but you are far more likely to collect insufficient data than the other way around. Therefore, it can be best to err on the side of caution and collect more data rather than less. If you are unsure whether your sample size is sufficient, we would recommend discussing this with your supervisor, especially because some statistical tests that you might run on your data (e.g., t-tests, factor analysis) have certain rules-of-thumb when it comes to what is considered the minimum sample required to avoid a Type II error. Remember that during the data collection phase, participants may drop out, or you may struggle to recruit as many people as you expected to come forward (i.e., this is not uncommon).
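One simple way to allow for drop-out is to inflate your recruitment target before you start collecting data. The sketch below is a minimal illustration of that arithmetic; the function name `recruitment_target`, the required sample of 63, and the 20% drop-out rate are hypothetical values, and a realistic drop-out rate for your own study is something to agree with your supervisor.

```python
def recruitment_target(required_n, dropout_percent):
    """Number of people to recruit so that required_n remain after drop-out.

    dropout_percent is a whole-number percentage (e.g., 20 for 20%).
    """
    if not 0 <= dropout_percent < 100:
        raise ValueError("dropout_percent must be between 0 and 99")
    remaining_percent = 100 - dropout_percent
    # Integer ceiling division, so the target is always rounded up.
    return -(-required_n * 100 // remaining_percent)


# Hypothetical figures: 63 completed responses needed, 20% expected drop-out.
print(recruitment_target(63, 20))
```

Rounding up rather than down matters here: recruiting one participant too many costs little, whereas ending up one short leaves you below the sample size you planned for.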

Unless your data collection takes place in a single instance (e.g., a single structured observation, a face-to-face survey or structured interview on a single day), we would recommend that you prepare for the **data analysis phase** whilst you are collecting your data, especially if your data collection takes place over a number of weeks. This is a good idea because **(a)** the data analysis phase always takes longer than you think, and **(b)** a good proportion of the time required is actually spent **preparing** your data and **running** statistical tests (i.e., using a computer program such as SPSS) rather than **interpreting** the results, which is often far less time-consuming by comparison.

Having already read STAGE SIX: Setting your research strategy, you should roughly know what statistical tests you will run on your data. If not, you can learn how to do this in STEP ONE: Select the correct statistical tests to run on your data in **STAGE NINE: Data analysis**. We would also suggest that whilst you collect your data, you read up on STEP TWO: Prepare and analyse your data using a relevant statistics package, also in **STAGE NINE: Data analysis**. At this stage, you will only need to learn about **preparing** your data rather than **analysing** it because you cannot analyse your data until you have collected all of it. Also, the statistical tests that you think you need to run on your data may change when you actually come to analyse it, which is something you'll also be able to learn about in STAGE NINE: Data analysis. Preparing your data is mainly about knowing how to enter your data into SPSS (i.e., the statistics package we cover in the Data Analysis part of Lærd Dissertation) and setting it up correctly so that you can easily analyse it once it has all been collected. When you are ready, proceed to STAGE NINE: Data analysis.