Incidences of Mass Extinction Events:
An Example of the Chi-Squared Goodness of Fit Test
October 21, 2002

The data used here were taken from http://www.zoology.ubc.ca/~schluter/bio300www/poissonnotes.html. Quoting from that web page:

The main use of the Poisson distribution is in providing us with a null hypothesis for events in space or time. Deviations from this hypothesis tell us something about real processes in nature. For example, consider the distribution of extinction events over intervals of time across millions of years of fossil history. The best record of extinctions through earth's history come from fossil marine invertebrates, because they have shells and preserve well. Below I list the number of recorded extinctions of marine invertebrate families (a high-level taxonomic category) in 76 intervals of time, ...

The question of interest is whether the pattern of extinction events through the fossil record is "random" in time, or whether instead extinctions tend to be clumped and occur in bursts ("mass extinctions"). Alternatively, extinctions may occur in a dispersed pattern. The easiest way to test this is to compare the frequency distribution of extinctions to that expected from a Poisson distribution using a chi-square goodness of fit test. Our hypotheses are:

Ho: The number of extinctions per time interval has a Poisson distribution.

Ha: The number of extinctions per time interval does NOT have a Poisson distribution.

In Table 1 we show the data. Because several of the categories have expected frequency less than 5, we grouped the first two categories (corresponding to time periods with 0 and 1 extinction events) and the last 13 categories (corresponding to time periods with 8 or more extinction events). The grouped data are shown in Table 2.
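The grouping step described above can be sketched in a few lines of Python. This is only an illustration: the helper name `pool` and its `low`/`high` parameters are our own choices, and the two lists are simply the observed and expected columns of Table 1 typed in by hand.

```python
# Observed and expected frequencies for 0..20 events, copied from Table 1.
observed = [0, 13, 15, 16, 7, 10, 4, 2, 1, 2, 1, 1, 0, 0, 1, 0, 2, 0, 0, 0, 1]
expected = [1.128, 4.748, 9.997, 14.030, 14.769, 12.437, 8.728, 5.250,
            2.763, 1.293, 0.544, 0.208, 0.073, 0.024, 0.007, 0.002,
            0.001, 0.000, 0.000, 0.000, 0.000]


def pool(freqs, low=1, high=8):
    """Pool cells 0..low into one cell and cells >= high into another.

    `pool`, `low`, and `high` are hypothetical names used for this sketch.
    """
    head = sum(freqs[:low + 1])          # 0 and 1 events combined
    middle = list(freqs[low + 1:high])   # 2..7 events, left as they are
    tail = sum(freqs[high:])             # 8 or more events combined
    return [head] + middle + [tail]


grouped_observed = pool(observed)
grouped_expected = [round(e, 3) for e in pool(expected)]
labels = ["0 to 1"] + [str(k) for k in range(2, 8)] + ["8 & above"]

for label, o, e in zip(labels, grouped_observed, grouped_expected):
    print(f"{label:>10}  {o:3d}  {e:7.3f}")
```

The printed rows should agree with the observed and expected columns of Table 2.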

To test the goodness of fit of the Poisson model (under which extinction events occur completely at random), we need to estimate its parameter $\lambda$, the mean number of events per time interval. The Maximum Likelihood Estimate (MLE) happens to be the sample mean, which from Table 1 is $\hat{\lambda} = 320/76 \approx 4.2105$. The expected frequencies were computed with this value. With the grouped data, we computed the $\chi^2$ value of 23.94996. The number of degrees of freedom is $k - 1 - p = 8 - 1 - 1 = 6$, where $k = 8$ is the number of categories and $p = 1$ is the number of parameters estimated. The critical value is $\chi^2(0.05, 6) = 12.59159$. Since our observed value exceeds this, we reject Ho at the 5% level and conclude that the observed data do differ significantly from the Poisson distribution.

Examining the Z-scores in Table 2, we note that there are two values outside the range $\pm 1.960$: there are "too many" time intervals with 0 to 1 extinction events ($Z = 2.939$) and "too few" time intervals with 4 events ($Z = -2.022$). There is also a "suggestion" of "too many" intervals with 8 or more events, although we would not consider $Z = 1.843$ significant. The results are rather intriguing. The results for 0 to 1 and for 8 and above suggest that extinction events are "clumping together," but the result for periods with 4 events is perplexing.
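As a rough check on the decision rule used in the test, the upper-tail probability of a chi-squared variable with an even number of degrees of freedom has a simple closed form, so it can be evaluated without a statistics library. The sketch below is our own illustration (the function name `chi2_sf_even_df` is hypothetical); it only handles even degrees of freedom, which is enough for the 6 degrees of freedom used here.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for chi-squared X with even df.

    For df = 2m the tail equals exp(-x/2) * sum_{i<m} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


n_categories = 8   # grouped cells in Table 2
n_estimated = 1    # the Poisson mean was estimated from the data
df = n_categories - 1 - n_estimated

chi2_observed = 23.94996   # test statistic reported in the text
critical = 12.59159        # 5% critical value reported in the text

tail_at_critical = chi2_sf_even_df(critical, df)
p_value = chi2_sf_even_df(chi2_observed, df)

print("degrees of freedom:", df)
print("tail area beyond the critical value:", round(tail_at_critical, 4))
print("reject Ho at the 5% level:", p_value < 0.05)
```

The tail area beyond the critical value should come out very close to 0.05, which confirms that 12.59159 is the 5% point for 6 degrees of freedom, and the p-value for the observed statistic falls well below that level.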

Table 1: The raw data on the numbers of extinction events during 76 time periods. The first column is the number of extinction events in a time interval. The second column shows the observed number of intervals with that many extinction events. The third column shows the expected number of such intervals under the Poisson model with the parameter estimated by maximum likelihood. For the expected frequency in the last row (20 events), we actually computed the probability of 20 or more events (instead of exactly 20), but it still rounds to 0.000.

Number of Events	Observed Frequency	Expected Frequency
0	0	1.128
1	13	4.748
2	15	9.997
3	16	14.030
4	7	14.769
5	10	12.437
6	4	8.728
7	2	5.250
8	1	2.763
9	2	1.293
10	1	0.544
11	1	0.208
12	0	0.073
13	0	0.024
14	1	0.007
15	0	0.002
16	2	0.001
17	0	0.000
18	0	0.000
19	0	0.000
20	1	0.000
Total	76	76
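The expected-frequency column of Table 1 can be reproduced, up to rounding, with a short Python sketch. The variable names and the helper `poisson_pmf` are ours, not part of the original analysis; the observed counts are copied from the table, and the last cell uses the probability of 20 or more events, as described in the caption.

```python
import math

# Observed number of intervals with 0, 1, ..., 20 extinction events (Table 1).
observed = [0, 13, 15, 16, 7, 10, 4, 2, 1, 2, 1, 1, 0, 0, 1, 0, 2, 0, 0, 0, 1]

n_intervals = sum(observed)                                 # number of time periods
total_events = sum(k * o for k, o in enumerate(observed))   # all extinctions
lam_hat = total_events / n_intervals                        # MLE = sample mean


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Probabilities of exactly 0..19 events, then "20 or more" as the remainder.
probs = [poisson_pmf(k, lam_hat) for k in range(20)]
probs.append(1.0 - sum(probs))

expected = [n_intervals * pr for pr in probs]

print(f"lambda-hat = {lam_hat:.4f}")
for k, (o, e) in enumerate(zip(observed, expected)):
    print(f"{k:2d}  {o:3d}  {e:7.3f}")
```

Because the last cell is defined as the remaining probability, the expected frequencies add up to the number of intervals, matching the Total row.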

Table 2: Grouped data from Table 1. The first column is the number of extinction events in a time interval. The second column shows the observed number of intervals whose number of extinction events falls in the range given in column one. The third column shows the expected number of such intervals under the Poisson model with the parameter estimated by maximum likelihood. Note that we have combined 0 and 1 (the first two rows) from Table 1, and also all the rows for 8 and above. The fourth column is the Z-score, that is, (observed - expected) divided by the square root of the expected frequency. The entry under Total (the last row) in column 4 (the entry with the asterisk) is the sum of squares of the Z-scores, which is the observed value of the $\chi ^2$ test statistic.

Grouped No. of Events	Observed Frequency	Expected Frequency	Z-score
0 to 1	13	5.876	2.9388879
2	15	9.997	1.5823249
3	16	14.030	0.5259414
4	7	14.769	-2.0215737
5	10	12.437	-0.6910313
6	4	8.728	-1.6003689
7	2	5.250	-1.4184163
8 & above	9	4.915	1.8425967
Total	76	76	23.94996*
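The Z-score column and the test statistic in Table 2 can be recomputed from the observed and expected columns. The following Python sketch is an illustration only: it assumes each Z-score is (observed - expected) / sqrt(expected), which is consistent with the tabulated values, and it uses the expected frequencies as rounded in the table, so the results may differ from the table in the last few decimal places.

```python
import math

# Grouped data copied from Table 2.
labels = ["0 to 1", "2", "3", "4", "5", "6", "7", "8 & above"]
observed = [13, 15, 16, 7, 10, 4, 2, 9]
expected = [5.876, 9.997, 14.030, 14.769, 12.437, 8.728, 5.250, 4.915]

# Z-score for each cell: (observed - expected) / sqrt(expected).
z_scores = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]

# The chi-squared statistic is the sum of the squared Z-scores.
chi2 = sum(z * z for z in z_scores)

# Cells whose Z-score falls outside +/- 1.960.
flagged = [label for label, z in zip(labels, z_scores) if abs(z) > 1.960]

for label, z in zip(labels, z_scores):
    print(f"{label:>10}  {z:8.4f}")
print(f"chi-squared = {chi2:.3f}")
print("cells outside +/- 1.960:", flagged)
```

Only the "0 to 1" and "4" cells are flagged, in line with the discussion of the Z-scores in the text.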


Dennis Cox 2002-10-21