Introduction to McNemar Test

The McNemar test can be thought of as a repeated measures, or paired sample, version of a chi-square test of independence. It is used to test for a change in proportion between two time points.

Please note that from here on, “0” indicates a non-event or non-case while “1” indicates an event or case, e.g. “0” indicates no disease while “1” indicates disease.

Hypotheses being tested:

  • Ho: The probability of being 0 at Time_1 and 1 at Time_2 equals the probability of being 1 at Time_1 and 0 at Time_2
    • Referencing the table below, these are cell b and cell c respectively
  • Ha: These two probabilities are not equal

Let’s clarify this a bit more with a table. The null hypothesis states that there is no difference between the proportion of individuals in cell b and the proportion in cell c, while the alternative hypothesis states that there is a difference.

                                    After Treatment
                               Healthy (0)   Disease (1)
Before Treatment  Healthy (0)       a             b
                  Disease (1)       c             d

Note: McNemar tests can only be used for a 2×2 table.
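
For reference, the test statistic depends only on the two discordant cells, b and c. The form below is the standard version without a continuity correction. Treating this as the variant intended here is an assumption, although it agrees with the results reported later in this example.

```latex
\chi^2 = \frac{(b - c)^2}{b + c}, \qquad \text{df} = 1
```

Cells a and d (individuals whose status did not change) do not enter the statistic at all, which is why the hypotheses are stated in terms of b and c only.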


There are three main assumptions:

  • Two categorical variables (before & after), each with 2 mutually exclusive groups
  • The group pairs are mutually independent
    • i.e. group status before treatment and group status after treatment are assigned independently of each other. Another way to say this is that a participant can be assigned to one group and not the other. This applies to every group pair.
  • The sample must be random

Data used in this example

Data used in this example is fictitious and can be found on our GitHub, or can be downloaded with the code below. Let’s load in the required libraries and get a general feel for the data.

In this example data, I will be looking to see if the intervention decreased the percentage of individuals that had fair/poor health. This means the hypothesis being tested is that the intervention had a significant effect in decreasing the number of people that reported fair/poor health.

import pandas as pd
import researchpy as rp

# Load the example data set
df = pd.read_csv("")

# Counts and percentages for each outcome at both time points
rp.summary_cat(df[['fairpoor_t1', 'fairpoor_t2']])


Variable      Outcome   Count   Percent
fairpoor_t1   0         133     66.5
              1         67      33.5
fairpoor_t2   0         143     71.5
              1         57      28.5

Remember that “1” indicates a case while “0” indicates a non-case. Referencing the example data, 33.5% of individuals reported fair/poor health at time 1 and 28.5% of individuals reported fair/poor health at time 2. Is this a significant decrease in the percentage of individuals that reported fair/poor health? Let’s find out!


First, let’s get a view of the cross tabulation of these two variables. This will help visualize what’s being tested. Quick note: this method from researchpy returns the cross tabulation table and the results table as a tuple. I will store each table as a separate object and then display them individually. Full documentation can be found here.

# rp.crosstab returns a tuple: (cross tabulation table, test results table)
table, res = rp.crosstab(df['fairpoor_t1'], df['fairpoor_t2'], test='mcnemar')



                   fairpoor_t2
                   0      1      All
fairpoor_t1  0     92     41     133
             1     51     16     67
             All   143    57     200


McNemar results
McNemar’s Chi-square ( 1.0) = 1.0870
p-value = 0.2971
Cramer’s phi = 0.0737
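
As a sanity check, the reported numbers can be recomputed by hand from the two discordant cells of the cross tabulation (b = 41, c = 51). The sketch below is not how researchpy computes its results internally; it assumes the uncorrected McNemar statistic and assumes that the reported phi is the square root of the chi-square divided by the sample size. The function name `mcnemar_uncorrected` is made up for this illustration.

```python
import math


def mcnemar_uncorrected(b, c):
    """Uncorrected McNemar chi-square and its p-value (1 degree of freedom)."""
    statistic = (b - c) ** 2 / (b + c)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)), so only the standard library is needed.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Discordant cells and total sample size from the cross tabulation
b, c, n = 41, 51, 200

statistic, p_value = mcnemar_uncorrected(b, c)

# Assumption: effect size taken as sqrt(chi-square / n)
phi = math.sqrt(statistic / n)

print(f"chi-square = {statistic:.4f}")
print(f"p-value    = {p_value:.4f}")
print(f"phi        = {phi:.4f}")
```

Under these assumptions the three printed values should agree with the results table above to within rounding.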

Two hundred participants were recruited to take part in an intervention designed to educate them about the benefits of exercise and reducing their overall BMI. McNemar’s test was used to test for a difference in the proportion of participants reporting fair/poor health before and after the intervention. The difference was not statistically significant, McNemar’s chi-square (1) = 1.0870, p = 0.2971. This suggests that the intervention was not effective at reducing the proportion of participants reporting fair/poor health.

References: National Institute of Standards and Technology. (2018, November 4). McNemar Test. Retrieved from h