Tuesday, June 30, 2015

A Chi-Square Crime (Part 2)

This is Part 2 of a post about chi-square analysis. The text is taken from the free online lab manual developed for the Introductory Cell Biology course at Cal Poly. In Part 1, an analogy between chi-square analysis and a courtroom was presented. Below is the language we use to introduce students to calculating and interpreting chi-square values.

Chi Square Test

The chi-square test provides a way to determine whether differences between expected and observed values are significant. A chi-square test will help you determine how well your predictions fit your data. We will use the numbers from one of Gregor Mendel's important genetic experiments using peas. The numbers represent the progeny of pea plants heterozygous at the locus controlling seed shape (R=round, r=wrinkled, parents were Rr). Mendel predicted a 3:1 ratio of round to wrinkled peas in the next generation. What he saw was 5474 round peas and 1850 wrinkled peas…a ratio of about 2.96:1. That is very close to 3:1 but still not exactly the same. We must determine whether the difference between the observed and expected values is significant. The expected values come from splitting the 7324 total peas 3:1, which gives 5493 round and 1831 wrinkled. The table below illustrates how to calculate a chi-square (Χ²) value:

Table 1: Calculating Chi-Square for Mendel's Data

                    Round    Wrinkled
Observed (o)        5474     1850
Expected (e)        5493     1831
Deviation (o-e)     -19      19
Deviation² (d²)     361      361
d²/e                0.07     0.20

Χ² = ∑d²/e = 0.27
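The arithmetic in Table 1 can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_statistic` and the dictionary layout are assumptions made for the example and are not part of the lab manual.

```python
def chi_square_statistic(observed, expected_ratio):
    """Sum d^2 / e over all phenotypes, as in Table 1.

    observed:       phenotype -> observed count (o)
    expected_ratio: phenotype -> expected proportion (must sum to 1)
    """
    total = sum(observed.values())
    chi_square = 0.0
    for phenotype, o in observed.items():
        e = total * expected_ratio[phenotype]  # expected count (e)
        d = o - e                              # deviation (o - e)
        chi_square += d ** 2 / e               # contribution d^2 / e
    return chi_square


# Mendel's seed-shape data and the predicted 3:1 ratio.
mendel_observed = {"round": 5474, "wrinkled": 1850}
mendel_ratio = {"round": 3 / 4, "wrinkled": 1 / 4}

chi_square = chi_square_statistic(mendel_observed, mendel_ratio)
print(round(chi_square, 3))
```

Without intermediate rounding the sum comes out near 0.263. Table 1 rounds each d²/e term to two decimals before adding, which is why it reports 0.27. Either value leads to the same conclusion.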

Notice that the Χ² value is a function of the difference (the "deviation") between the expected and observed values. So now we have a Χ² value of 0.27…how does this help us determine if the difference between the observed and expected values is significant or not? We use the Χ² value to determine the probability (the p value).

1) Determine the degrees of freedom (df) for your experiment. This is one less than the number of possible phenotypes in your experiment. In the case of Mendel's cross, there are two phenotypes, round and wrinkled, so df = 1 (2 possible phenotypes - 1).

2) On Table 2, find the row that corresponds to the df for your experiment.

3) Read along this row until you find your Χ² value. For Mendel's cross remember that Χ² = 0.27. This value falls between 0.15 and 0.46 in the row of the table corresponding to df = 1.

4) Read up to the top row to find the probability or p value. In our case, p is between 0.70 and 0.50 (0.70 > p > 0.50).

Table 2: Chi-Square Distribution

Probability (p value)
(Nonsignificant: p = 0.95 to 0.10 | Significant: p = 0.05 to 0.001)

df   0.95    0.90   0.80   0.70   0.50   0.30   0.20   0.10   0.05   0.01    0.001
1    0.004   0.02   0.06   0.15   0.46   1.07   1.64   2.71   3.84   6.64    10.83
2    0.10    0.21   0.45   0.71   1.39   2.41   3.22   4.60   5.99   9.21    13.82
3    0.35    0.58   1.01   1.42   2.37   3.66   4.64   6.25   7.82   11.34   16.27
4    0.71    1.06   1.65   2.20   3.36   4.88   5.99   7.78   9.49   13.28   18.47
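Steps 1 through 4 amount to a row lookup, which can be sketched in Python as follows. The names `P_VALUES`, `CRITICAL`, and `p_bracket` are hypothetical, chosen for this example; the numbers are copied directly from Table 2.

```python
# Column headings of Table 2 (p values), left to right.
P_VALUES = [0.95, 0.90, 0.80, 0.70, 0.50, 0.30, 0.20, 0.10, 0.05, 0.01, 0.001]

# Rows of Table 2: df -> chi-square values under each p column.
CRITICAL = {
    1: [0.004, 0.02, 0.06, 0.15, 0.46, 1.07, 1.64, 2.71, 3.84, 6.64, 10.83],
    2: [0.10, 0.21, 0.45, 0.71, 1.39, 2.41, 3.22, 4.60, 5.99, 9.21, 13.82],
    3: [0.35, 0.58, 1.01, 1.42, 2.37, 3.66, 4.64, 6.25, 7.82, 11.34, 16.27],
    4: [0.71, 1.06, 1.65, 2.20, 3.36, 4.88, 5.99, 7.78, 9.49, 13.28, 18.47],
}


def p_bracket(chi_square, num_phenotypes):
    """Return (upper, lower) bounds on p read from Table 2."""
    df = num_phenotypes - 1          # step 1: degrees of freedom
    row = CRITICAL[df]               # step 2: find the row for this df
    upper = 1.0                      # p can never exceed 1
    for p, critical in zip(P_VALUES, row):
        if chi_square < critical:    # step 3: value falls before this column
            return upper, p          # step 4: read the p values above
        upper = p
    return upper, 0.0                # beyond the last column: p < 0.001


print(p_bracket(0.27, 2))  # (0.7, 0.5)
```

For Mendel's cross (Χ² = 0.27, two phenotypes) the function returns the same range found by hand: p lies between 0.70 and 0.50.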

Since our p value is above 0.05, we DO NOT reject the null hypothesis (that the data fit the predicted 3:1 ratio). In other words, the difference between the observed and expected values in this experiment is nonsignificant. (Notice that in Table 2 all the p values greater than 0.05 are "nonsignificant" and those of 0.05 and below are "significant".) To come back around to the courtroom example…If you were on the jury and you knew that 50-70% of people in the town had jewel-encrusted Piaget watches, would you find the defendant guilty?
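As a cross-check on the table lookup, the p value can also be computed directly. The sketch below is not from the lab manual: it relies on the fact that, for df = 1 only, the upper-tail probability of a chi-square value x equals erfc(√(x/2)), and the function names are hypothetical.

```python
import math


def p_value_df1(chi_square):
    """Upper-tail p value for a chi-square statistic. Valid for df = 1 ONLY."""
    # With one degree of freedom, P(X >= x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_square / 2))


def is_significant(p, alpha=0.05):
    """Apply the rule used above: p of 0.05 or below counts as significant."""
    return p <= alpha


p = p_value_df1(0.27)
print(0.50 < p < 0.70)    # True: agrees with the range read from Table 2
print(is_significant(p))  # False: do not reject the null hypothesis
```

For experiments with more than one degree of freedom this shortcut does not apply, and the table (or a statistics library) is the way to go.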

1 comment:

1. I have used the chi-square test throughout my engineering work whenever statistical analysis was required. It yields accurate, or nearly accurate, results. It's very useful indeed.