The t-test is one of the most commonly encountered inferential analyses. It has a few different forms that you will need to be aware of.
The Student's t-test for one sample is used when you want to compare the mean from one measurement variable to a theoretical mean based on your expectations under the null hypothesis. The statistical null hypothesis is that the mean of the measurement variable is equal to the theoretical or hypothesized mean (a number decided upon before conducting the experiment). Please note: this sort of comparison (actual results to theoretical results) is not encountered as often as either a two-sample or paired t-test, and should not be your first choice of analysis.
As an example, you could use a one-sample t-test to compare the heights of a set of adults to the mean height of adults as reported by the CDC. Or, you could compare the actual waiting times for patients at a doctor's office to a hypothesized waiting time of one hour.
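As a minimal sketch of the height comparison, the R code below runs a one-sample t-test and then recomputes the t-statistic by hand. The `heights` values are made up for illustration (they are not the course data), and `mu0` is simply a name chosen here for the hypothesized mean.

```r
# Hypothetical heights in inches (made-up values for illustration only)
heights <- c(68.1, 70.4, 71.2, 66.9, 69.8, 72.5, 67.3, 70.0, 69.1, 71.7)

# Hypothesized mean, decided on before looking at the data
mu0 <- 69.3

# One-sample t-test of the null hypothesis that the true mean equals mu0
result <- t.test(x = heights, mu = mu0)
print(result)

# The same t-statistic by hand: (sample mean - mu0) / standard error of the mean
n <- length(heights)
t_manual <- (mean(heights) - mu0) / (sd(heights) / sqrt(n))

# Degrees of freedom for a one-sample t-test: n - 1
df_manual <- n - 1
```

Comparing `t_manual` and `df_manual` with the printed output is a quick way to see where the reported numbers come from.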
The Student's t-test for two samples is what most people mean when they talk about performing a t-test. It is an appropriate test when you have one measurement variable and one nominal variable with only two values (e.g. male or female, positive or negative), and you want to compare the mean values of the measurement variable. The statistical null hypothesis, then, is that the means of the measurement variable are equal for each category of the nominal variable.
As an example, a two-sample t-test would be appropriate if you want to compare overall test scores for a class of students separated into two different groups (a control and a treatment group). In this case, test score is the measurement variable, and the group number/category would be the nominal variable.
Note: Student's t-test assumes that the two groups have equal variances. The commonly encountered Welch's t-test does not make this assumption. We will be using the former in our examples, which is why var.equal = TRUE appears in them (R's t.test() runs Welch's test unless you set it). When in doubt about which one to use, consult a statistician.
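The difference between the two versions can be sketched in R as follows. The `scores` data frame is hypothetical (made-up values that only borrow the column names used in this module), so treat the output as illustrative rather than as a result from the course data.

```r
# Hypothetical scores (made-up values for illustration only)
scores <- data.frame(
  finalScore = c(78, 85, 69, 91, 74, 82, 88, 95, 79, 90, 86, 93),
  treatment  = rep(c("control", "treatment"), each = 6)
)

# Student's t-test: assumes the two groups share a common variance
student <- t.test(finalScore ~ treatment, data = scores, var.equal = TRUE)

# Welch's t-test: what t.test() runs by default (var.equal = FALSE)
welch <- t.test(finalScore ~ treatment, data = scores)

# Student's test uses n1 + n2 - 2 degrees of freedom
student$parameter

# Welch's test uses an adjusted df that is usually not a whole number
welch$parameter
```

The degrees of freedom are the easiest place to tell from the output which version of the test was actually run.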
The paired t-test is used when you have one measurement variable and two nominal variables, when one of the nominal variables has only two values and you have only one observation for each combination of nominal variables. Basically, the observations are paired. Datasets that can be analyzed using a paired t-test often look like they have two measurement variables, i.e., two columns of measurement data (for which linear regression and correlation would be more appropriate), but don't be confused. If both columns are measuring the same thing, just under different conditions, you likely should be using a paired t-test.
You typically encounter this test in experiments where one of the nominal variables represents an individual and the other represents pre- and post-treatment. The statistical null hypothesis for this test is that the mean difference between the paired observations is zero.
As an example, let's say you're testing insulin production in mice before and after injection of a new drug. Since each mouse's insulin production is measured twice (a pre- and a post-test, in a sense), the observations are paired. Insulin production is the measurement variable, one of the nominal variables is the mice being tested, and the other nominal variable is the drug treatment (which has only two values, i.e., before and after).
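A small sketch of the mouse example in R is shown below. The `before` and `after` vectors are invented for illustration. The second call shows that a paired t-test gives the same result as a one-sample t-test on the within-mouse differences against a mean of zero, which is the null hypothesis stated above.

```r
# Hypothetical insulin measurements for six mice (made-up values)
before <- c(10.2, 11.5, 9.8, 12.1, 10.9, 11.3)
after  <- c(11.0, 12.4, 10.1, 13.0, 11.2, 12.5)

# Paired t-test: element i of `before` is matched with element i of `after`
paired_result <- t.test(x = before, y = after, paired = TRUE)

# Equivalent one-sample t-test on the differences, with hypothesized mean 0
diff_result <- t.test(x = before - after, mu = 0)

paired_result$statistic
diff_result$statistic
```

Because the pairing is done by position, the two vectors must list the same individuals in the same order.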
You may have read scientific papers that mention using either a one-tailed or a two-tailed t-test. You may also remember a section covering tailed probabilities in the reading on hypothesis testing from earlier in this module. To summarize, one-tailed probabilities focus only on one side of the distribution, while two-tailed probabilities incorporate both sides. They each have their uses, but in reality, you will very rarely encounter a situation or experimental design that would require you to perform a one-tailed analysis.
As an example, let's say you were interested in seeing if potatoes grown in one type of soil (soil 1) weigh more than potatoes grown in a different type of soil (soil 2). Even though we are looking to see if the mean weight increases for those potatoes grown in soil 1 (which would shift the distribution to one side), what happens if the opposite ends up being true? It's entirely possible that potatoes grown in soil 1 would end up weighing less than potatoes grown in soil 2. If we used a one-tailed analysis, we would be excluding this possibility, which may very well lead to a false positive. So, even if a result in one direction is unlikely, we should still use a two-tailed analysis. One-tailed analyses should really only be used if results in one direction are logically impossible.
For an example of when it might actually be appropriate to use a one-tailed analysis, consider the following excerpt:
"[W]e may wish to test whether a caterpillar feeding on a leaf for only one hour is going to remove a statistically significant amount of leaf area. There is obviously no way it can make leaves bigger! So if the mean leaf area of leaves with a caterpillar is larger than the area of leaves without caterpillars, the most that can have happened (whatever the size of the difference resulting from the chances of sampling) is that the caterpillars have not eaten away a leaf area. The difference (however big) must be a bad estimate of a true zero difference. Only a leaf area difference 'eaten' is less than 'uneaten' can be real! 'Uneaten' less than 'eaten' has to be nonsense!" (van Emden, 2008, p. 86)
Essentially, unless you are absolutely certain that you should be using a one-tailed analysis, your default should be to use a two-tailed test. As such, this is what the tests in this module use.
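In R, the number of tails is controlled by the `alternative` argument of `t.test()`, which defaults to `"two.sided"`. The sketch below uses made-up potato weights (`soil1` and `soil2` are hypothetical vectors) to show how the one-tailed and two-tailed p-values relate to each other.

```r
# Hypothetical potato weights in grams (made-up values for illustration)
soil1 <- c(182, 175, 190, 168, 177, 185, 172, 180)
soil2 <- c(170, 181, 165, 174, 169, 178, 160, 172)

# Two-tailed test (the default): is there a difference in either direction?
two_sided <- t.test(soil1, soil2, var.equal = TRUE)

# One-tailed tests: only one direction counts as evidence against the null
greater <- t.test(soil1, soil2, var.equal = TRUE, alternative = "greater")
less    <- t.test(soil1, soil2, var.equal = TRUE, alternative = "less")

# The two-tailed p-value is twice the smaller of the two one-tailed p-values
two_sided$p.value
2 * min(greater$p.value, less$p.value)
```

A one-tailed test in the direction the data happen to point halves the p-value, which is why the direction has to be justified before the data are collected.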
Before we look at how to run t-tests in R, we need to know what we're looking for. In general, you will see three new values being calculated that you should report in some way: the test statistic (which for a t-test is, appropriately, called a t-statistic), the degrees of freedom, and the p-value. Means and confidence intervals are also reported automatically in the output of R's t.test() function, and you will likely need to report these in some way as well.
We will be using some new datasets (linked below the table) for the examples in the video.
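As a rough illustration of where these values live, the sketch below stores the result of `t.test()` and pulls each piece out by name. The `heights` vector is hypothetical (made-up values), and 69.3 is the comparison mean used elsewhere in this module.

```r
# Hypothetical heights in inches (made-up values for illustration only)
heights <- c(68.1, 70.4, 71.2, 66.9, 69.8, 72.5, 67.3, 70.0, 69.1, 71.7)

# t.test() returns a list-like object of class "htest"
result <- t.test(x = heights, mu = 69.3)

result$statistic  # t-statistic
result$parameter  # degrees of freedom
result$p.value    # p-value
result$estimate   # sample mean
result$conf.int   # confidence interval (95% unless conf.level is changed)
```

Storing the result this way lets you report exact values instead of copying them from the printed output.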
t-test | Dataset | Description |
---|---|---|
one-sample | bodyMeasures | Measurements of male student heights and weights. We will be using average measurements from the CDC (69.3 inches at the time of this writing) to make our comparisons. |
two-sample | paperScores | Examination of student writing skills. Students were separated into either a treatment or control group. Data include final paper score as well as total references used. |
paired | testScores | Examination of student learning before and after treatment. Data include pre-test and post-test scores for each student. |
(Don't hesitate to use the player controls to pause, rewind, slow down the video as needed! A thorough understanding of the concepts is vastly preferable to just speeding through.)
Here's a quick reference table for remembering which arguments you should use for each type of t-test.
t-test | Arguments | Example |
---|---|---|
one-sample | x, for measurement variable; mu, for theoretical mean | t.test( x = bodyMeasures$height, mu = 69.3 ) |
two-sample | formula, for defining the variables you'll be using; data, for specifying the dataframe; var.equal, must equal TRUE for a Student's t-test (as opposed to a Welch's t-test) | t.test( formula = finalScore ~ treatment, data = paperScores, var.equal = TRUE ) |
paired | x, for measurements for one of the paired observations; y, for measurements for the other of the paired observations; paired, must equal TRUE for paired t-test | t.test( x = testScores$preTest, y = testScores$postTest, paired = TRUE ) |
This quiz will ask you to perform the three different types of t-tests we've gone over using the three datasets linked below.