Guide 12
Using the Chi-Squared Test
An in-depth look at one of the most useful inferential statistical tests you can use
Trigger
Use when planning to collect categorical data or when conducting inferential statistics on that categorical data
Part 1
A Brief Introduction to the Chi-Square Test

The chi-squared test (pronounced “kai”) is a statistical test for categorical data. There are three variations of the chi-squared test: the association test, the goodness-of-fit test, and the homogeneity test. You can see some basic facts about the test below and its variations later in this guide.

The basic formula for the chi-squared statistic is broken down below, alongside some basic facts about the test in general.

  • The chi-squared test uses two categorical variables (which can be nominal or ordinal) → see Collection 3, Handbook 3, Topic 4 for more
  • It’s a nonparametric, inferential test → see Collection 5, Handbook 2, Topic 3 for more
  • Chi-squared statistics (the values generated from the test) are never negative
  • The larger the chi-squared statistic, the stronger the evidence that there’s a relationship between the two categorical variables being assessed
  • The test works really well for sample sizes between 50 and 500 participants.
  • The chi-squared test can only tell you whether or not two categorical variables are associated, not how strong that association actually is (to measure the strength of an association, jump to step 6 in Part 3 of this guide)
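
For reference, the standard chi-squared statistic is usually written as shown here. The symbols are notation chosen for this sketch rather than taken from the guide: O_ij is the observed count in row i and column j of the contingency table, E_ij is the count you'd expect in that cell if the two variables were unrelated, r and c are the number of rows and columns, and N is the total sample size.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{N}
```

Every term is a squared difference divided by a positive expected count, which is why the statistic can't drop below zero.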
Part 2
Choosing a Chi-Squared Variation

Use this decision guide to narrow down which variation of chi-squared test you should consider using.

Decision 1

Are you trying to assess the relationship between two categorical variables?

  • Yes → continue
  • No → see this resource for other statistical methods you can use

Decision 2

Did you get this categorical data using random sampling?

  • Yes → continue
  • No → use another statistical method on categorical data

Decision 3

Do you have at least three people in each cell of a contingency table between the categorical variables?

  • Yes → continue
  • No → avoid using the chi-squared test
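
A quick way to check Decision 3 is to scan your contingency table for sparse cells. This is a minimal sketch: the function name and the counts are made up for illustration, and the threshold of three comes from the question above.

```python
def cells_below_minimum(table, minimum=3):
    """Return (row, column, count) for every cell with fewer than `minimum` people."""
    return [
        (r, c, count)
        for r, row in enumerate(table)
        for c, count in enumerate(row)
        if count < minimum
    ]

# Hypothetical 2 x 3 contingency table (made-up counts, for illustration only).
example_table = [
    [12, 7, 2],
    [9, 15, 4],
]

sparse_cells = cells_below_minimum(example_table)
print(sparse_cells)  # any entries listed here mean the answer to Decision 3 is "No"
```

An empty list means every cell meets the minimum.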

Decision 4

Do you have fewer than 1,000 participants total?

  • Yes → continue
  • No → use the chi-squared test with caution

NOTE: You can use the chi-square test with larger sample sizes. However, it’s a known issue that with large samples (think 1,000+), even very small associations tend to come out as statistically significant. This can lead to a biased or skewed understanding of the association between the two categorical variables.
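
One way to see why large samples cause trouble: if the proportions in a table stay the same, the chi-squared statistic grows in step with the sample size. The sketch below uses made-up counts and a hypothetical helper function to show the same pattern at 100 and at 1,000 participants.

```python
def chi_squared_statistic(table):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: the same 60/40 vs. 40/60 split at two sample sizes.
small_sample = [[30, 20], [20, 30]]        # 100 participants
large_sample = [[300, 200], [200, 300]]    # 1,000 participants

small_stat = chi_squared_statistic(small_sample)
large_stat = chi_squared_statistic(large_sample)
print(small_stat, large_stat)  # the second value is ten times the first
```

The association is identical in both tables, but the larger sample produces a much bigger statistic. This is why an effect size matters alongside the p-value.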

Decision 5

Do you want to compare your current dataset with any past data?

  • Yes → Use the Chi-Squared Goodness-of-Fit Test (see this link for help)
  • No → continue

Tip: You can use the chi-square goodness-of-fit test as a way to assess sample representativeness. Make a contingency table with the relevant population characteristics (such as gender ratios, locations, or device preferences) and your sample values. Use the chi-squared goodness-of-fit test on that table. Statistically significant results imply that your sample is *not* representative of your population.
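
The representativeness check in the tip above can be sketched in a few lines. The device categories, population shares, and sample counts below are invented for illustration, and the function name is hypothetical.

```python
def goodness_of_fit_statistic(observed_counts, population_proportions):
    """Return the chi-squared goodness-of-fit statistic and its degrees of freedom."""
    total = sum(observed_counts)
    statistic = 0.0
    for observed, proportion in zip(observed_counts, population_proportions):
        # Count you'd expect if the sample mirrored the population exactly
        expected = total * proportion
        statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = len(observed_counts) - 1
    return statistic, degrees_of_freedom

# Hypothetical device preferences: phone, tablet, desktop
population_proportions = [0.5, 0.3, 0.2]   # assumed known population shares
sample_counts = [60, 25, 15]               # what your sample of 100 looks like

statistic, df = goodness_of_fit_statistic(sample_counts, population_proportions)
print(f"chi-squared = {statistic:.2f}, df = {df}")
```

You'd then compare the statistic against the critical value for those degrees of freedom, just as you would for the other chi-squared variations.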

Decision 6

Are you comparing two different populations?

  • Yes → continue
  • No → Use the Chi-Square Test of Association

Decision 7

Did you independently sample from both populations? (meaning that sampling from one population didn’t affect the chances that someone was selected from the other population)

  • Yes → Use the Chi-Squared Test of Homogeneity (see this link for help)
  • No → use another statistical method
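
If it helps to see the whole decision guide in one place, here is one way to write Decisions 1 through 7 as a function. The function and parameter names are made up for this sketch. Decision 4 is left out of the branching because it only adds a caution.

```python
def choose_chi_squared_variation(
    two_categorical_variables,
    random_sampling,
    min_three_per_cell,
    comparing_with_past_data,
    comparing_two_populations,
    independently_sampled=False,
):
    """Walk the decision guide and return a short recommendation."""
    if not two_categorical_variables:       # Decision 1
        return "other statistical method"
    if not random_sampling:                 # Decision 2
        return "another statistical method for categorical data"
    if not min_three_per_cell:              # Decision 3
        return "avoid the chi-squared test"
    # Decision 4 (sample size) is a caution only, so it doesn't change the branch.
    if comparing_with_past_data:            # Decision 5
        return "goodness-of-fit test"
    if not comparing_two_populations:       # Decision 6
        return "test of association"
    if independently_sampled:               # Decision 7
        return "test of homogeneity"
    return "another statistical method"

# One randomly sampled group, two variables, no past data, one population
print(choose_chi_squared_variation(True, True, True, False, False))
```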
Part 3
Using the Chi-Square Association Test (with Example)

Example Scenario: Smartwatch Feature Prioritization

Background
  • You’re a tenured researcher at a large tech company. One product the company makes is a smart fitness tracker watch. Your team handles the software experience for the smartwatch. Right now, your team is trying to prioritize their backlog of features. There are four key features that all positively impact the team’s metrics. These features are auto-workout detection, blood oxygen monitoring, social challenges, and long-term trend analysis.
  • The team believes that while all four features are helpful, the type of activity someone does regularly will influence which feature they want the most. They’re interested in building more features for basketball because basketball athletes upgrade to premium plans more often than athletes in the other sports. You operationalize this into the following hypotheses:
  • Null Hypothesis (H0): Activity type and desired feature are independent of each other.
  • Alternate Hypothesis (HA): Activity type and desired feature are not independent of each other.
Approach
  • You design and test an 8-item survey to better understand what features current athletes want.
  • You randomly sample 400 respondents, keeping the survey open until you have 100 responses for each of the four biggest activity types (basketball, swimming, running, and walking in this example).
Analysis
  • You’ve started your analysis and are about to run a chi-square test to test the above hypotheses. You then create a contingency table (see example data below).

Step 1

Calculate the chi-squared statistic

Use this online calculator for contingency tables, up to five rows and five columns. You can automatically calculate the chi-squared statistic and the p-value.

Scenario Chi-Square Statistic = 68.67

Scenario P-value < 1 x 10^-5 (or .00001)
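
If you'd rather compute these values yourself than rely on a calculator, the sketch below shows one way to do it using only Python's standard library. The counts are invented placeholders, not the scenario's data, so the output won't match 68.67. The function names are hypothetical too. If you have SciPy installed, `scipy.stats.chi2_contingency` should give you the statistic and p-value directly.

```python
import math

def chi_squared_statistic(table):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

def chi_square_sf(x, df):
    """Upper-tail probability (the p-value) for a chi-squared value with df degrees of freedom."""
    y = x / 2
    # Start from a closed form (df = 2 or df = 1), then step up two df at a time.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q

# Hypothetical counts. Rows: basketball, swimming, running, walking.
# Columns: auto-workout detection, blood oxygen, social challenges, trend analysis.
table = [
    [40, 20, 30, 10],
    [20, 40, 15, 25],
    [30, 25, 20, 25],
    [25, 30, 10, 35],
]

statistic = chi_squared_statistic(table)
df = (len(table) - 1) * (len(table[0]) - 1)
p_value = chi_square_sf(statistic, df)
print(f"chi-squared = {statistic:.2f}, df = {df}, p-value = {p_value:.5f}")
```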

Step 2

Calculate the degrees of freedom

Calculate the degrees of freedom (df) by following this formula, using the rows and columns in your starting contingency table: (Number of rows minus one) x (number of columns minus one)

You can read more about degrees of freedom with this link or this other link.

Scenario Degrees of Freedom (df) =  (4 rows – 1) x (4 columns – 1) = 3 x 3 = 9 degrees of freedom
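
The same calculation as a one-line helper (the function name is made up for this sketch):

```python
def degrees_of_freedom(num_rows, num_columns):
    """(rows - 1) x (columns - 1) for a contingency table."""
    return (num_rows - 1) * (num_columns - 1)

print(degrees_of_freedom(4, 4))  # → 9
```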

Step 3

Look up the critical value

Use a chi-square distribution table (you can use this link) to find the value that corresponds to your degrees of freedom (df) and alpha level. The alpha level is the value that you compare your p-value against to help determine statistical significance. Commonly, the alpha level is 0.05 (as shown in this example).

Scenario Critical Value = 16.92 (9 df at .05 alpha level)
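
If you don't have a table handy, you can approximate the critical value numerically. This is a rough sketch using only the standard library: it searches for the point where the upper-tail probability equals your alpha level. Both function names are hypothetical. With SciPy, `scipy.stats.chi2.ppf(1 - alpha, df)` should return the same number.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability for a chi-squared value with df degrees of freedom."""
    y = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q

def critical_value(alpha, df, tolerance=1e-9):
    """Find the value whose upper-tail probability equals alpha, by bisection."""
    low, high = 0.0, 1.0
    while chi_square_sf(high, df) > alpha:   # widen until the answer is bracketed
        high *= 2
    while high - low > tolerance:
        mid = (low + high) / 2
        if chi_square_sf(mid, df) > alpha:
            low = mid    # tail is still too big, so move right
        else:
            high = mid
    return (low + high) / 2

print(round(critical_value(0.05, 9), 2))  # → 16.92
```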

Step 4

Compare your chi-square statistic against the critical value

  • Chi-squared statistic > critical value = evidence against the null hypothesis
  • Chi-squared statistic < critical value = not enough evidence to reject the null hypothesis

Scenario Comparison = 68.67 (Chi-Square statistic) > 16.92 (Critical Value)

Step 5

Compare your p-value against the alpha level

  • p-value < 0.05 = evidence to reject the null hypothesis (aka statistical significance)
  • p-value > 0.05 = not enough evidence to reject the null hypothesis

Scenario p-value < Alpha-level: 1 x 10^-5 (or .00001) < .05, meaning results were statistically significant
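
Steps 4 and 5 are two views of the same decision, so they should agree when the critical value and p-value come from the same degrees of freedom and alpha level. A small sketch (the function name is made up) using the scenario's numbers:

```python
def is_statistically_significant(statistic, critical, p_value, alpha=0.05):
    """Apply the comparisons from steps 4 and 5 and require both to pass."""
    exceeds_critical = statistic > critical   # step 4
    below_alpha = p_value < alpha             # step 5
    return exceeds_critical and below_alpha

# Scenario values from this guide
print(is_statistically_significant(68.67, 16.92, 0.00001))  # → True
```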

If you have statistically significant results, then you have evidence that the two variables are associated. To determine the association's strength, you can calculate the effect size using the Cramer’s V test (step 6 below).

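Cramer's V Formula: V = √(χ² / (n × min(rows – 1, columns – 1))), where χ² is the chi-squared statistic and n is the total number of participants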

If you didn’t have statistically significant results, skip steps 6 and 7.

Step 6

Calculate effect size (if results were statistically significant) and interpret it

Effect sizes tell you the size/magnitude of the differences between groups or variables. If you see a large effect size, then you can assume that your two categorical variables are strongly associated. You can learn more in this video.

You can use this online calculator to calculate Cramer’s V.

Scenario Cramer’s V statistic = 0.24
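
Cramer's V is also simple to compute by hand. The sketch below uses the standard formula, the square root of the chi-squared statistic divided by (sample size x the smaller of rows minus one and columns minus one), with the scenario's numbers. The function name is made up for illustration.

```python
import math

def cramers_v(chi_squared, sample_size, num_rows, num_columns):
    """Cramer's V effect size for a contingency table."""
    smaller_dimension = min(num_rows - 1, num_columns - 1)
    return math.sqrt(chi_squared / (sample_size * smaller_dimension))

# Scenario: chi-squared = 68.67, 400 respondents, 4 x 4 table
print(round(cramers_v(68.67, 400, 4, 4), 2))  # → 0.24
```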

Step 7

Compare your effect size with benchmarks

Interpret your effect size using the diagram below:

Effect sizes are an area of hot debate in statistics. The common understanding is that effect sizes tend to be small in most studies. You can read more here.

Scenario: With an effect size of 0.24, there's evidence of a moderately strong association between activity type and desired feature.
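
Benchmarks vary by source. As one possible reference point, the sketch below applies Cohen's (1988) conventional cutoffs of 0.10, 0.30, and 0.50, scaled by table size. This is an assumption on our part: the cutoffs in this guide's diagram may differ, and the function name and labels are made up.

```python
import math

def interpret_cramers_v(v, num_rows, num_columns):
    """Label an effect size using Cohen's conventions, adjusted for table size."""
    smaller_dimension = min(num_rows - 1, num_columns - 1)
    # Cohen's cutoffs (0.10, 0.30, 0.50) divided by the square root of the smaller dimension
    small, medium, large = (
        w / math.sqrt(smaller_dimension) for w in (0.10, 0.30, 0.50)
    )
    if v >= large:
        return "large"
    if v >= medium:
        return "medium"
    if v >= small:
        return "small"
    return "negligible"

# Scenario: V = 0.24 from a 4 x 4 table
print(interpret_cramers_v(0.24, 4, 4))  # → medium
```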

Step 8

Interpret all your results and arrive at a conclusion

Ask yourself the following questions to see if you have enough evidence to make a claim about the association between your two categorical variables:

  • Did you meet all the chi-square test requirements?
  • Did you have fewer than 1,000 total participants?
  • Does your study design lower/limit bias as much as possible?
  • Was your chi-square statistic larger than your critical value?
  • Was your p-value smaller than your alpha level?
  • Was your effect size moderate or large?
  • Is the association between variables practical or plausible in the real world?
  • Can the business build better products if the association were real?

If you answered yes to all of these, you likely have enough evidence to claim that the two variables are associated.