Phi Coefficient: Definition & Examples


The Phi Coefficient (sometimes called the mean square contingency coefficient) is a measure of the association between two binary variables.

For a given 2×2 table for two random variables x and y, with cell counts A and B in the first row and C and D in the second row:

The Phi Coefficient can be calculated as:

Φ = (AD-BC) / √[(A+B)(C+D)(A+C)(B+D)]
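
As a rough illustration, the formula can be written as a small Python function. The function name `phi_coefficient` and its argument names are assumptions made for this sketch, and the check for empty rows or columns is an added safeguard rather than part of the formula itself.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2x2 table with cells
    [[a, b],
     [c, d]].
    """
    # Product of the two row totals and the two column totals
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column with no observations leaves phi undefined
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / math.sqrt(denom)
```

Calling `phi_coefficient(A, B, C, D)` with the four cell counts returns a value between -1 and 1.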

Example: Calculating a Phi Coefficient

Suppose we want to know whether or not gender is associated with political party preference. We take a simple random sample of 25 voters and survey them on their political party preference. The following table shows the results of the survey:

[Table: 2×2 contingency table of gender by political party preference, with cell counts A = 4, B = 9, C = 8, D = 4]

We can calculate the Phi Coefficient between the two variables as:

Φ = (4*4-9*8) / √[(4+9)(8+4)(4+8)(9+4)] = (16-72) / √24336 = -56 / 156 ≈ -0.359

Note: We could have also calculated this using the Phi Coefficient Calculator.
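
The arithmetic above can also be checked with a few lines of Python. This sketch hard-codes the cell counts from the example (A = 4, B = 9, C = 8, D = 4); the variable names are only illustrative.

```python
import math

# Cell counts from the survey example
A, B, C, D = 4, 9, 8, 4

# Numerator: 4*4 - 9*8 = 16 - 72 = -56
numerator = A * D - B * C

# Denominator: sqrt(13 * 12 * 12 * 13) = sqrt(24336) = 156
denominator = math.sqrt((A + B) * (C + D) * (A + C) * (B + D))

phi = numerator / denominator
print(round(phi, 4))  # -0.359
```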

How to Interpret a Phi Coefficient

Similar to a Pearson Correlation Coefficient, a Phi Coefficient takes on values between -1 and 1 where:

  • -1 indicates a perfectly negative relationship between the two variables.
  • 0 indicates no association between the two variables.
  • 1 indicates a perfectly positive relationship between the two variables.
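
To make these three reference values concrete, here is a small Python sketch using made-up tables (the counts are hypothetical, chosen only so that the results come out exactly). The helper name `phi` is an assumption; it implements the formula given earlier.

```python
import math

def phi(a, b, c, d):
    # Phi coefficient for the 2x2 table [[a, b], [c, d]]
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# All observations fall in the off-diagonal cells (B and C)
print(phi(0, 10, 15, 0))   # -1.0

# A*D equals B*C, so the numerator is zero (25 observations in total)
print(phi(1, 4, 4, 16))    # 0.0

# All observations fall in the diagonal cells (A and D)
print(phi(10, 0, 0, 15))   # 1.0
```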

In general, the further away a Phi Coefficient is from zero, the stronger the relationship between the two variables.

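In other words, the further away a Phi Coefficient is from zero, the more systematic the pattern between the two variables. Note that Φ describes the strength of the association in the sample; whether that association is statistically significant also depends on the sample size.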
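
The comparison with the Pearson Correlation Coefficient is more than an analogy: if each binary variable is coded as 0 and 1, the Pearson correlation of the two coded variables equals Φ. The sketch below checks this for the survey counts used earlier, in plain Python. The helper names `phi` and `pearson` are assumptions, and so is the choice of which category is coded as 1; swapping the coding of one variable flips the sign but not the magnitude.

```python
import math

def phi(a, b, c, d):
    # Phi coefficient for the 2x2 table [[a, b], [c, d]]
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def pearson(xs, ys):
    # Pearson correlation computed directly from its definition
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

A, B, C, D = 4, 9, 8, 4

# One (x, y) pair per respondent; the first row is coded x = 1
# and the first column is coded y = 1 (an arbitrary choice)
pairs = [(1, 1)] * A + [(1, 0)] * B + [(0, 1)] * C + [(0, 0)] * D
xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]

print(math.isclose(phi(A, B, C, D), pearson(xs, ys)))  # True
```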

Additional Resources

A Guide to the Pearson Correlation Coefficient
A Guide to Fisher’s Exact Test
A Guide to the Chi-Square Test of Independence

3 Replies to “Phi Coefficient: Definition & Examples”

  1. Didn’t you skew your survey by choosing an uneven number of participants? That eliminates the possibility of a 0 solution, doesn’t it?

    1. Good question, though an odd number of participants does not by itself rule out a phi of zero.

      The phi coefficient is a number we use to see if there’s a relationship between two things that can each be yes or no—like whether someone owns a dog (yes or no) and whether they prefer the outdoors (yes or no). The phi value ranges from -1 (a perfect negative link) to 0 (no link) to +1 (a perfect positive link).

      Now, your question was about having an uneven number of participants in a survey. Does that mess things up? Not exactly—but it can make things a little tricky.

      You can still get a phi value of zero, which means no association, even with an odd number of people. What really matters is how the answers are spread out across the two categories for each variable. If the responses are balanced—like close to half answering yes and half no for both questions—then it’s easier for the math to work out to zero.

      But if most people say yes to one thing and very few say yes to the other, the table of results gets lopsided. That kind of imbalance can make it harder to land on a phi of zero, even if there’s no real connection between the two questions. So, in that sense, yes—the way your survey is set up and who you include in it can influence the results.

      In short, an odd number of participants doesn’t automatically rule out getting a phi of zero. But if the way responses are distributed is very uneven, it might make it harder to show there’s no relationship.
