Introduction to Data Analysis and R: Hypothesis Testing

Hypothesis Testing

Before we begin learning data analysis, we need to make sure we understand hypothesis testing.

The main goals of statistical analyses are to describe your data (descriptive stats) and draw conclusions about what they show (inferential stats). However, before you can draw conclusions, you first have to determine what you're looking for: you need to outline your hypothesis. After all, if you don't know your hypothesis, how will you know whether your data support it?

Hypotheses generally come in pairs (null and alternate), and these pairs have two forms (research and statistical). Your null and alternate research hypotheses will inform your null and alternate statistical hypotheses, and it is the latter pair that is the focus of any statistical analyses you run.

Research hypotheses

Your research hypotheses typically take the form of a pair of statements that describe in general terms what you will be testing. For example, let's say you're studying student learning. You want to see whether a group of students who receive additional writing instruction achieve higher scores on their final papers than a different group of students who receive no additional instruction.

Your null hypothesis takes the stance that there will be no difference, and your alternate hypothesis takes the stance that there will be a difference. Your null and alternate research hypotheses, therefore, could be something like this:

Null: Students receiving additional writing instruction perform no better or worse than students who don't.

Alternate: Students receiving additional writing instruction do perform better than students who don't.

Statistical hypotheses

Your statistical hypotheses restate your research hypotheses in more specific terms, typically detailing in some way the measurements you'll be taking and how your groups should relate to each other:

Null: Final paper scores of students who receive additional writing instruction are no different, on average, from the final paper scores of students who do not receive additional instruction.

Alternate: Final paper scores of students who receive additional writing instruction are different, on average, from the final paper scores of students who do not receive additional instruction.
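
If it helps to see this pair in symbols, here is one way it could be written. This is only a sketch under a couple of assumptions: I am reading "on average" as the mean final paper score of each group, and the symbols themselves are my own shorthand rather than notation used elsewhere in this lesson.

```latex
% Assumed notation (hypothetical, for illustration only):
%   \mu_{\text{instruction}} = mean final paper score, students with additional writing instruction
%   \mu_{\text{none}}        = mean final paper score, students with no additional instruction
\begin{align*}
H_0 &: \mu_{\text{instruction}} = \mu_{\text{none}} \\
H_A &: \mu_{\text{instruction}} \neq \mu_{\text{none}}
\end{align*}
```

Note that the alternate uses "not equal to" rather than "greater than": it says only that the two averages differ, without naming a direction.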

You may have noticed that my alternate research hypothesis states that I expect to see an increase in student performance, while my alternate statistical hypothesis just states that I expect to see a difference. This is an important distinction: even if you expect to see a change in one particular direction, as a general rule you should never define one direction in your statistical hypothesis unless the other direction is literally impossible. We'll cover this in more detail later when we talk about one- and two-tailed analyses.

Supplemental Reading

Basic concepts of hypothesis testing (McDonald 2014)