What is it?

The paired sample t-test is also called the dependent sample t-test. It’s a univariate test that tests for a significant difference between 2 related variables. An example of this is if you were to collect the blood pressure of the same individuals before and after some treatment, condition, or time point.

The hypotheses being tested are:

  • Null hypothesis (H0): μd = 0, which translates to the mean difference between sample 1 and sample 2 is equal to 0.
  • Alternative hypothesis (HA): μd ≠ 0, which translates to the mean difference between sample 1 and sample 2 is not equal to 0.

If the p-value is less than the chosen significance level, most commonly 0.05, one can reject the null hypothesis.
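To make the hypotheses concrete, the test statistic can be computed by hand from the paired differences d: t = mean(d) / (sd(d) / sqrt(n)), with n - 1 degrees of freedom. The following is a minimal sketch, assuming made-up readings (the `before` and `after` arrays and the `paired_t_statistic` helper are illustrative, not the blood pressure data used later). It compares the manual result with `scipy.stats.ttest_rel`.

```python
import numpy as np
from scipy import stats

# Hypothetical paired readings for illustration only (not the article's data)
before = np.array([150.0, 162.0, 148.0, 171.0, 155.0, 160.0, 143.0, 158.0])
after = np.array([146.0, 158.0, 150.0, 163.0, 151.0, 157.0, 144.0, 150.0])

def paired_t_statistic(x, y):
    """Return (t, degrees of freedom) for the paired differences x - y."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    n = d.size
    standard_error = d.std(ddof=1) / np.sqrt(n)  # sample SD of the differences
    return d.mean() / standard_error, n - 1

t_manual, dof = paired_t_statistic(before, after)
# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), dof)

result = stats.ttest_rel(before, after)
print(t_manual, p_manual)
print(result.statistic, result.pvalue)
```

Both printed lines should agree, which shows that the paired test is a one-sample t-test on the differences.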

Paired Sample t-test Assumptions

In order for the paired sample t-test results to be trusted, the following assumptions need to be met:

  • The dependent variable (DV) must be continuous, i.e. measured on an interval or ratio scale
  • The DV should be approximately normally distributed
    • Testing for normality needs to be conducted on the differences between the two conditions, not the raw values of each condition itself
    • The paired sample t-test is robust to this violation. If there is a violation of normality, the test results can still be considered valid as long as the violation is not severe
  • The DV should not contain any significant outliers

If any of these assumptions are violated, a different test should be used. An alternative to the paired sample t-test is the Wilcoxon signed-rank test.
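As a rough illustration of that alternative, SciPy provides `scipy.stats.wilcoxon` for paired samples. This is only a sketch, assuming made-up readings (the `before` and `after` arrays are hypothetical and are not the data set used in this example).

```python
import numpy as np
from scipy import stats

# Hypothetical paired readings for illustration only (not the article's data)
before = np.array([150.0, 162.0, 148.0, 171.0, 155.0,
                   160.0, 143.0, 158.0, 166.0, 152.0])
after = np.array([146.5, 157.0, 150.5, 163.0, 149.0,
                  157.2, 144.3, 150.4, 159.1, 152.8])

# wilcoxon() ranks the absolute paired differences, so it does not
# assume that the differences are normally distributed
result = stats.wilcoxon(before, after)
print(result.statistic, result.pvalue)
```

The p-value is read the same way as for the t-test: a value below the chosen significance level suggests the paired differences are not centered on 0.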

Data used in this example

The data used in this example can be found on our GitHub page. The data set is fictitious and contains blood pressure readings before and after an intervention. These are variables “bp_before” and “bp_after”.

Let’s import pandas as pd, load the data, and then take a look at it!

import pandas as pd

df = pd.read_csv("blood_pressure.csv")

# Summary statistics for the two blood pressure variables
df[['bp_before', 'bp_after']].describe()

        bp_before    bp_after
count  120.000000  120.000000
mean   156.450000  151.358333
std     11.389845   14.177622
min    138.000000  125.000000
25%    147.000000  140.750000
50%    154.500000  149.500000
75%    164.000000  161.000000
max    185.000000  185.000000

Checking the Assumptions

Assumption Check: Outliers

The first thing we need to do is import the stats module from SciPy and then test the assumptions of the paired sample t-test. First, let’s check for any significant outliers in each of the variables.

from scipy import stats
import matplotlib.pyplot as plt

df[['bp_before', 'bp_after']].plot(kind='box')
# This saves the plot as a png file (the file name here is illustrative)
plt.savefig('blood pressure boxplot.png')

[Figure: box plot of bp_before and bp_after]

There don’t appear to be any significant outliers in the variables.
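The visual check can be backed up numerically. Matplotlib box plots draw their whiskers at 1.5 times the interquartile range (IQR) by default, and the same rule can be applied directly. The sketch below assumes a hypothetical helper, `iqr_outliers`, and a made-up series with one planted outlier; neither comes from the data set used here.

```python
import pandas as pd

def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]

# Hypothetical readings for illustration only; 210 is planted as an outlier
sample = pd.Series([138, 147, 150, 154, 156, 160, 164, 168, 210],
                   name="bp_example")
flagged = iqr_outliers(sample)
print(flagged)
```

Applying the same rule to the summary table above gives fences of 121.5 and 189.5 for bp_before (observed range 138 to 185) and 110.375 and 191.375 for bp_after (observed range 125 to 185), which agrees with the box plot showing no outliers.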

Assumption Check: Normal Distribution

Remember that for the dependent sample t-test, the normality check needs to be conducted on the differences between the two scores. There are a few ways one can test this assumption: make a histogram, use a Q-Q plot, and/or use a statistical test. Let’s create a variable for the differences and run through these.

df['bp_difference'] = df['bp_before'] - df['bp_after']

df['bp_difference'].plot(kind='hist', title='Blood Pressure Difference Histogram')
# Again, this saves the plot as a png file
plt.savefig('blood pressure difference histogram.png')

[Figure: histogram of the blood pressure differences]

The histogram suggests that the differences are approximately normally distributed. Another way to check for normally distributed data is to use a Q-Q plot. If you’re unfamiliar with how to read a Q-Q plot, the data points should fall on or close to the red line. If they don’t, it suggests that the data may not be normally distributed.

stats.probplot(df['bp_difference'], plot= plt)
plt.title('Blood pressure Difference Q-Q Plot')
plt.savefig('blood pressure difference qq plot.png')

[Figure: Q-Q plot of the blood pressure differences]

There is some deviation from normality, but it does not appear to be severe, so there is no need to worry. To be sure, let’s test statistically whether the differences are normally distributed. To test this, one can use the Shapiro-Wilk test for normality, stats.shapiro(). In the output shown here the values are not labeled. The first value is the W test statistic, and the second value is the p-value.

stats.shapiro(df['bp_difference'])
(0.9926842451095581, 0.7841846942901611)


The test was non-significant (p = 0.78), so there is no evidence that the differences between the two conditions deviate from a normal distribution. If this test were significant, an appropriate alternative to use would be the Wilcoxon signed-rank test.

Paired Samples T-Test Example

To conduct the paired sample t-test, one needs to use the stats.ttest_rel() function.

stats.ttest_rel(df['bp_before'], df['bp_after'])
Ttest_relResult(statistic=3.3371870510833657, pvalue=0.0011297914644840823)


The findings are statistically significant! One can reject the null hypothesis in support of the alternative.

Another component needed to report the findings is the degrees of freedom (df). This can be calculated by taking the total number of paired observations and subtracting 1. In our case, df = 120 – 1 = 119.
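As a quick consistency check, the reported p-value can be recovered from the t statistic and the degrees of freedom alone. This is a sketch using only the numbers reported above; the variable names are illustrative.

```python
from scipy import stats

# Values reported above: the t statistic from ttest_rel and 120 paired observations
t_statistic = 3.3371870510833657
n_pairs = 120
dof = n_pairs - 1  # degrees of freedom

# Two-sided p-value: probability of a |t| at least this large under H0
p_value = 2 * stats.t.sf(abs(t_statistic), dof)

# Two-sided critical value at the 0.05 significance level
t_critical = stats.t.ppf(1 - 0.05 / 2, dof)
print(dof, p_value, t_critical)
```

The t statistic is well above the critical value, which is another way of seeing that the null hypothesis is rejected at the 0.05 level.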

Interpretation of the Results

A paired sample t-test was used to analyze the blood pressure before and after the intervention to test if the intervention had a significant effect on the blood pressure. The blood pressure before the intervention was higher (156.45 ± 11.39 units, mean ± standard deviation) compared to the blood pressure post intervention (151.36 ± 14.18 units); there was a statistically significant decrease in blood pressure of 5.09 units (t(119) = 3.34, p = 0.0011).
