#### Chi Square Test

The chi-square test provides a way to determine whether differences between expected and observed values are significant. A chi-square test will help you determine how well your predictions fit your data. We will use the numbers from one of Gregor Mendel's important genetic experiments with peas. The numbers represent the progeny of pea plants heterozygous at the locus controlling seed shape (R = round, r = wrinkled; the parents were Rr). Mendel predicted a 3:1 ratio of round to wrinkled peas in the next generation. What he saw was 5474 round peas and 1850 wrinkled peas…a 2.9:1 ratio. 2.9:1 is *very* close to 3:1, but still not exactly the same. We must determine whether the difference between the observed and expected values is significant. The table below illustrates how to calculate a chi-square (Χ²) value:

**Table 1: Calculating Chi-Square for Mendel's Data:**

| Round | Wrinkled |
---|---|---|
Observed (o) | 5474 | 1850 |
Expected (e) | 5493 | 1831 |
Deviation (o − e) | −19 | 19 |
Deviation² (d²) | 361 | 361 |
d²/e | 0.07 | 0.20 |

Χ² = ∑d²/e = 0.27
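The arithmetic in Table 1 can be sketched in a few lines of Python (the variable names here are just for illustration):

```python
# Chi-square for Mendel's cross: observed counts vs. the 3:1 expectation.
observed = [5474, 1850]                    # round, wrinkled
total = sum(observed)                      # 7324 progeny in all
expected = [total * 3 / 4, total * 1 / 4]  # 3:1 ratio -> 5493 and 1831

# X^2 = sum over all phenotype classes of (deviation squared / expected)
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi_square)  # about 0.263; Table 1's 0.27 comes from rounding each term first
```

Note that summing the unrounded terms gives about 0.26; the table's 0.27 is the sum of the two values after each was rounded to two decimal places. Either way the conclusion below is unchanged.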

Notice that the Χ² value is a function of the difference (the "deviation") between the expected and observed values. So now we have a Χ² value of 0.27…how does this help us determine whether the difference between the observed and expected values is significant or not? We use the Χ² value to determine the probability (the **p value**).

1) Determine the **degrees of freedom** (df) for your experiment. This is one less than the number of possible phenotypes in your experiment. In the case of Mendel's cross, there are two phenotypes, round and wrinkled, so df = 1 (2 possible phenotypes − 1).

2) On Table 2, find the row that corresponds to the df for your experiment.

3) Read along this row until you find your Χ² value. For Mendel's cross, remember that Χ² = 0.27. This value falls between 0.15 and 0.46 in the row of the table corresponding to df = 1.

4) Read up to the top row to find the probability or p value. In our case, p is between 0.70 and 0.50 (0.70 > p > 0.50).
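Rather than bracketing p between two columns of the table, the exact p value for df = 1 can be computed with Python's standard library, since a chi-square variable with one degree of freedom is the square of a standard normal (this shortcut is a sketch, not part of the original lookup procedure, and the identity below holds only for df = 1):

```python
import math

# For df = 1, the chi-square statistic is the square of a standard normal Z,
# so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
def chi2_pvalue_df1(x):
    return math.erfc(math.sqrt(x / 2))

p = chi2_pvalue_df1(0.27)
print(p)  # about 0.60, consistent with the table's 0.70 > p > 0.50
```

For other degrees of freedom you would need the full chi-square survival function (for example, `scipy.stats.chi2.sf` if SciPy is available).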

**Table 2: Chi-Square Distribution**

Column headings are probabilities (p values). The p values greater than 0.05 (the 0.95 through 0.10 columns) are nonsignificant; 0.05 and below are significant.

df | 0.95 | 0.90 | 0.80 | 0.70 | 0.50 | 0.30 | 0.20 | 0.10 | 0.05 | 0.01 | 0.001 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0.004 | 0.02 | 0.06 | 0.15 | 0.46 | 1.07 | 1.64 | 2.71 | 3.84 | 6.64 | 10.83 |
2 | 0.10 | 0.21 | 0.45 | 0.71 | 1.39 | 2.41 | 3.22 | 4.60 | 5.99 | 9.21 | 13.82 |
3 | 0.35 | 0.58 | 1.01 | 1.42 | 2.37 | 3.66 | 4.64 | 6.25 | 7.82 | 11.34 | 16.27 |
4 | 0.71 | 1.06 | 1.65 | 2.20 | 3.36 | 4.88 | 5.99 | 7.78 | 9.49 | 13.28 | 18.47 |

Since our p value is above 0.05, we DO NOT reject the null hypothesis. In other words, the difference between the observed and expected values in this experiment is nonsignificant. (Notice that in Table 2 all the p values greater than 0.05 are "nonsignificant" and 0.05 and below are "significant.") To come back around to the courtroom example…if you were on the jury and you knew that 50–70% of people in the town had jewel-encrusted Piaget watches, would *you* find the defendant guilty?
