Repeated-measures ANOVA

- Additional Topic: Sphericity

Also known as within-subjects designs, these tests are used when each subject is measured multiple times. Different treatments may be applied to each subject over time, or to groups of subjects in a uniform way. Similar to paired t-tests, these tests increase the power of the analysis by accounting for the idiosyncratic differences between subjects.

The following conditions make a study appropriate for repeated-measures ANOVA:

Questions which might be suitable for this type of analysis include: Does an experimental diet lead to better test performance of two groups of study animals? Which medium leads to the most proliferation in several cell lines over time? Do subjects improve their balance over time when given a sequence of experimental treatments?

Here we will use a real data set to ask whether different concentrations of a tree bark extract lead to different survival rates of termites. These data can be used to see if the tree bark compound would be suitable for development as an anti-termite treatment.

Open Termites.xls (see the Data Appendix). This study has a "mixed design" or "two-way design with one repeated measure" in the terminology of Portney & Watkins, with two treatment levels applied to different blocks of subjects, and many measurements in time for each subject.

Go to Analyze > General Linear Model > Repeated Measures. The first dialog requires you to "define factors". Here we need to name two new objects. The first is the Within-Subject Factor Name, which you can name for what is actually being assessed at each measure; in this case, it is the number of termites surviving (survival). There are 13 measures in our data set (days 3 and 9 were skipped). The second is the Measure Name. This should just be the time units for the repeated measures, which in this case is day. Type in each, making sure to click "Add", then choose Define.

The next dialog shows all 13 levels of the "survival" factor, named as "day". We want to match these up with the 13 columns of measurements we have. Select day1 to day15, and click the arrow to move them into the Within-Subjects box. Then move dose into the Between-Subjects Factors box.
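As a rough illustration of the data layout this dialog expects, the sketch below builds the 13 column names and reshapes one wide record (one column per day) into long form (one record per day). The column names day1 to day15 and dose come from the text; the subject identifier, the dose value and the counts in the example row are invented placeholders, not values from Termites.xls.

```python
# Days 3 and 9 were not measured, leaving 13 repeated measures.
SKIPPED_DAYS = {3, 9}
DAY_COLUMNS = [f"day{d}" for d in range(1, 16) if d not in SKIPPED_DAYS]


def wide_to_long(row, subject_id):
    """Turn one wide record (one column per day) into one record per day."""
    return [
        {
            "subject": subject_id,
            "dose": row["dose"],
            "day": int(col[3:]),      # "day15" -> 15
            "survival": row[col],
        }
        for col in DAY_COLUMNS
    ]


# Hypothetical dish: placeholder dose and counts, NOT real data.
example_row = {"dose": 5, **{col: 25 - i for i, col in enumerate(DAY_COLUMNS)}}
long_rows = wide_to_long(example_row, subject_id=1)

print(len(DAY_COLUMNS))   # number of within-subject levels
print(long_rows[0])
print(long_rows[-1])
```

SPSS itself works on the wide layout; the long layout is only shown because many other packages expect it.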

It should now look like this:

In this example, we only have two doses. If we had more levels in this factor, we would want to examine the differences between each category using the Post Hoc dialog (Tukey).

To create a graph of the results, click Plots. Move dose (or whatever between-subjects factor you have) into the Separate Lines box, and survival into the Horizontal Axis box.

Finally, choose Options, and at least click the Estimates of effect size and Homogeneity tests boxes.

Choose Continue, and then OK to run the test.

Before looking at the results, it is necessary to digress briefly to discuss the concept of sphericity.

Sphericity

In other parametric tests, we have been concerned with the normal distribution of data and homogeneity of variances. In a repeated-measures design, we are also concerned with the relationships between the data at different time points: the variances of the differences between every pair of time points should be equal. This is known in statistics as sphericity. The assumption therefore concerns the covariances between measurements as well as their variances.

If the sphericity assumption is violated, the chance of a Type I error (incorrectly rejecting the null hypothesis of no difference between groups) increases. This is a troubling outcome, and unfortunately difficult to resolve.
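To make the assumption concrete, here is a minimal sketch in plain Python that checks a covariance matrix for sphericity in two ways: by listing the variance of the difference between each pair of time points, and by computing the Greenhouse-Geisser epsilon, which equals 1 under perfect sphericity and cannot fall below 1 / (k - 1) for k time points. The function names and both 3 x 3 matrices are hypothetical illustrations, not part of SPSS or of the termite data.

```python
def difference_variances(cov):
    """Variance of the difference between each pair of time points.

    Uses var(x_i - x_j) = s_ii + s_jj - 2 * s_ij.
    """
    k = len(cov)
    return {
        (i, j): cov[i][i] + cov[j][j] - 2 * cov[i][j]
        for i in range(k)
        for j in range(i + 1, k)
    }


def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon from a symmetric k x k covariance matrix."""
    k = len(cov)
    row_means = [sum(row) / k for row in cov]
    grand_mean = sum(row_means) / k
    # Double-centre the matrix (row means equal column means by symmetry).
    centred = [
        [cov[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centred[i][i] for i in range(k))
    sum_of_squares = sum(value * value for row in centred for value in row)
    return trace ** 2 / ((k - 1) * sum_of_squares)


# Hypothetical matrices for illustration only.
spherical = [[2.0, 1.0, 1.0],
             [1.0, 2.0, 1.0],
             [1.0, 1.0, 2.0]]
non_spherical = [[1.0, 0.9, 0.1],
                 [0.9, 1.0, 0.1],
                 [0.1, 0.1, 4.0]]

for name, cov in (("spherical", spherical), ("non-spherical", non_spherical)):
    print(name, difference_variances(cov), greenhouse_geisser_epsilon(cov))
```

In the first matrix every pairwise difference has the same variance, so epsilon is 1; in the second the difference variances are unequal and epsilon drops below 1.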

Alternatives include multivariate analyses of variance (MANOVA), which do not require sphericity. SPSS runs a MANOVA by default for a repeated-measures ANOVA, with the results in the Multivariate Tests table. There is rarely any major difference between the two approaches in terms of significance values, but if you need to choose the appropriate test, consult a specialized text on multivariate statistics (e.g., Manly 2005).

SPSS performs two tests related to sphericity, Box's Test for Equality of Covariance Matrices and Mauchly's Test of Sphericity. Portney & Watkins provide a succinct description of Mauchly's test (p. 447).

If the result of the Mauchly test is significant (p < 0.05), there is a significant violation of the assumption of sphericity. Therefore, we should correct the degrees of freedom when performing the ANOVA; SPSS does this automatically and notes it in a footnote beneath the Mauchly test table. The correction is called epsilon.
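The correction works by multiplying both the effect and the error degrees of freedom by epsilon before the significance value is looked up. The sketch below illustrates this, assuming the 13 measures described earlier, the sphericity-assumed error df of 168 and the rounded Greenhouse-Geisser epsilon of .172 that appear in the SPSS output for this example; the function name is a hypothetical helper.

```python
def corrected_df(df_effect, df_error, epsilon):
    """Scale both degrees of freedom by the epsilon correction."""
    return df_effect * epsilon, df_error * epsilon


n_measures = 13                 # repeated measures per subject (from the text)
df_effect = n_measures - 1      # 12
df_error = 168                  # sphericity-assumed error df from the output

epsilons = {
    "Sphericity Assumed": 1.0,
    "Greenhouse-Geisser": 0.172,           # rounded value from the Mauchly table
    "Lower-bound": 1 / (n_measures - 1),   # smallest possible epsilon
}

corrected = {
    name: corrected_df(df_effect, df_error, eps)
    for name, eps in epsilons.items()
}
for name, (dfe, dfr) in corrected.items():
    print(f"{name}: effect df = {dfe:.3f}, error df = {dfr:.3f}")
```

Because the epsilon used here is rounded to three decimals, the Greenhouse-Geisser degrees of freedom come out slightly different from the 2.059 and 28.824 that SPSS prints from the unrounded value.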

SPSS reports all possible significance values, using the different epsilon corrections. Here are the meanings of each of these:

Returning to the model results, we first see the multivariate analysis of variance tests. These test the effect of the within-subject factor, survival, as if each measurement were a different variable; that is what makes this a multivariate test. The different flavors of MANOVA are all identical here, showing a significant effect of the day measured; this is not interesting or surprising, since we expect that termites will start dying off in the petri dishes quite naturally.

However, the next set of values, survival * dose, shows no effect. Taken at face value, this indicates that the survival of termites did not differ depending on the concentration of tree bark extract, and therefore that the extract would not be useful as an anti-termite treatment. But this result should be treated very cautiously, since the multivariate test is less powerful than a repeated-measures ANOVA.
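The four multivariate statistics agree here because, with only two dose groups, each effect has a single non-zero eigenvalue. Under that assumption all four statistics are functions of the same number and give the same exact F. The sketch below reproduces the entries of the Multivariate Tests table from the Hotelling's Trace values alone; the function name is a hypothetical helper, and the relations hold only in this single-eigenvalue case.

```python
def multivariate_stats(root, hypothesis_df, error_df):
    """Statistics implied by a single non-zero eigenvalue (root)."""
    return {
        "Pillai's Trace": root / (1 + root),
        "Wilks' Lambda": 1 / (1 + root),
        "Hotelling's Trace": root,
        "Roy's Largest Root": root,
        "F": root * error_df / hypothesis_df,
    }


# Hotelling's Trace, hypothesis df and error df as reported in the table.
survival = multivariate_stats(46.139, hypothesis_df=12, error_df=3)
interaction = multivariate_stats(2.798, hypothesis_df=12, error_df=3)

for label, stats in (("survival", survival), ("survival * dose", interaction)):
    print(label)
    for name, value in stats.items():
        print(f"  {name}: {value:.3f}")
```

The computed values match the table to rounding, which is why the choice among the four statistics does not matter for this data set.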

Multivariate Tests(b)

| Effect | Test | Value | F | Hypothesis df | Error df | Sig. | Partial Eta Squared |
|---|---|---|---|---|---|---|---|
| survival | Pillai's Trace | .979 | 11.535(a) | 12.00 | 3.00 | .034 | .979 |
| | Wilks' Lambda | .021 | 11.535(a) | 12.00 | 3.00 | .034 | .979 |
| | Hotelling's Trace | 46.139 | 11.535(a) | 12.00 | 3.00 | .034 | .979 |
| | Roy's Largest Root | 46.139 | 11.535(a) | 12.00 | 3.00 | .034 | .979 |
| survival * dose | Pillai's Trace | .737 | .699(a) | 12.00 | 3.00 | .717 | .737 |
| | Wilks' Lambda | .263 | .699(a) | 12.00 | 3.00 | .717 | .737 |
| | Hotelling's Trace | 2.798 | .699(a) | 12.00 | 3.00 | .717 | .737 |
| | Roy's Largest Root | 2.798 | .699(a) | 12.00 | 3.00 | .717 | .737 |

a  Exact statistic

b  Design: Intercept+dose; Within Subjects Design: survival

Next come the results for the repeated-measures ANOVA. This requires that the covariance matrix of the data have "sphericity", as explained above. These data definitely do not; the covariances differ at different points in the experiment.

Mauchly's Test of Sphericity(b)

Measure: day

| Within Subjects Effect | Mauchly's W | Approx. Chi-Square | df | Sig. | Epsilon(a): Greenhouse-Geisser | Epsilon(a): Huynh-Feldt | Epsilon(a): Lower-bound |
|---|---|---|---|---|---|---|---|
| survival | .000 | 233.195 | 77 | .000 | .172 | .216 | .083 |

Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.

a  May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.

b  Design: Intercept+dose; Within Subjects Design: survival

Therefore, when we look below at the within-subject effects, we will look at them in the following order:

  1. Examine the results when sphericity is assumed;
  2. Since we know that the data have failed the sphericity test, look at the results after the Greenhouse-Geisser correction has been applied.
  3. If these agree, we are done. If they disagree, and the G-G results show no effect but the sphericity assumed results do show an effect, look at the Huynh-Feldt corrected results. This will be our final answer.

Tests of Within-Subjects Effects

Measure: day

| Source | Correction | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared |
|---|---|---|---|---|---|---|---|
| survival | Sphericity Assumed | 7130.356 | 12 | 594.196 | 112.924 | .000 | .890 |
| | Greenhouse-Geisser | 7130.356 | 2.059 | 3463.254 | 112.924 | .000 | .890 |
| | Huynh-Feldt | 7130.356 | 2.591 | 2751.769 | 112.924 | .000 | .890 |
| | Lower-bound | 7130.356 | 1.000 | 7130.356 | 112.924 | .000 | .890 |
| survival * dose | Sphericity Assumed | 561.952 | 12 | 46.829 | 8.900 | .000 | .389 |
| | Greenhouse-Geisser | 561.952 | 2.059 | 272.943 | 8.900 | .001 | .389 |
| | Huynh-Feldt | 561.952 | 2.591 | 216.870 | 8.900 | .000 | .389 |
| | Lower-bound | 561.952 | 1.000 | 561.952 | 8.900 | .010 | .389 |
| Error(survival) | Sphericity Assumed | 884.000 | 168 | 5.262 | | | |
| | Greenhouse-Geisser | 884.000 | 28.824 | 30.669 | | | |
| | Huynh-Feldt | 884.000 | 36.277 | 24.368 | | | |
| | Lower-bound | 884.000 | 14.000 | 63.143 | | | |

Notice that, within each effect, the F-ratio is the same for every correction; only the degrees of freedom, and hence the significance values, change. Even though the sphericity assumption has not been supported, the corrections applied do not change the final story.
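The reason the F-ratio does not move is that epsilon scales the effect and error degrees of freedom by the same factor, so it cancels when the two mean squares are divided. The sketch below recomputes F and partial eta squared for the survival effect from the sums of squares and degrees of freedom in the Tests of Within-Subjects Effects table; the variable names are illustrative, and small discrepancies arise because the printed degrees of freedom are rounded.

```python
SS_EFFECT = 7130.356   # Type III sum of squares for survival
SS_ERROR = 884.000     # Type III sum of squares for Error(survival)

# (effect df, error df) for each correction, as printed in the table.
DEGREES_OF_FREEDOM = {
    "Sphericity Assumed": (12, 168),
    "Greenhouse-Geisser": (2.059, 28.824),
    "Huynh-Feldt": (2.591, 36.277),
    "Lower-bound": (1.000, 14.000),
}


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = mean square of the effect / mean square of the error."""
    return (ss_effect / df_effect) / (ss_error / df_error)


f_values = {
    name: f_ratio(SS_EFFECT, df_effect, SS_ERROR, df_error)
    for name, (df_effect, df_error) in DEGREES_OF_FREEDOM.items()
}

# Partial eta squared does not depend on the degrees of freedom at all.
partial_eta_squared = SS_EFFECT / (SS_EFFECT + SS_ERROR)

for name, value in f_values.items():
    print(f"{name}: F = {value:.2f}")
print(f"Partial eta squared = {partial_eta_squared:.3f}")
```

The same calculation with the survival * dose sum of squares (561.952) gives the interaction row of the table.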

In particular, both "survival" (the day of measurement) and the interaction between day and dose are highly significant explanatory factors of the termite numbers. This differs from the MANOVA results, and since this is a more powerful test, we should focus just on the repeated-measures. The tree bark extract does have an effective anti-termite compound.

Finally, examine the profile plot. This immediately explains the results: the higher dose of tree bark extract led to significantly lower termite survival.

Other options

In the Repeated Measures dialog box, if you have multiple explanatory factors, you can choose which interactions to include in the model using the Model option.

This dialog also gives you the option to choose which type of sums of squares to use. This is a complex topic, but essentially, if the cell frequencies of the between-subject factors are unbalanced (i.e., the numbers of observations in the different treatments are unequal), Type IV sums of squares is recommended.

Additionally, there are other procedures which can accomplish an appropriate analysis.
