Which results of the standard test for community weighted mean approach are too optimistic?

David Zelený

The relationship between the average height of understory herbs (CWM of species height) and the amount of light passing through the forest canopy (quantified by the visually estimated cover of trees and shrubs at each site). If this CWM-based analysis is tested with a standard parametric test, the result is highly significant (P<0.001). However, the result of the standard test is overly optimistic; the max test, which is the correct test for this analysis, returns a considerably less exciting P-value (P=0.052).

Imagine a forest, with trees forming a canopy and herbs growing in the understory. The amount of light passing through the canopy is limited, and herbs may need to compete for light by growing taller. One may ask: is the height of plants in the understory related to the amount of light passing through the canopy? Such a question is an example of a common task in community ecology: relating the attributes of species in the community (e.g. plant height) to the attributes of the sites at which the community occurs (e.g. the amount of light). One way to do this is the community weighted mean (CWM) approach, in which values of species attributes are averaged across the species occurring at each site (weighted or not by species abundance), and these means are then related to site attributes, e.g. by correlation or regression. In the forest example above, this amounts to averaging the heights of the species present at each site and relating this mean to the amount of light passing through the canopy (see the figure above).
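
The averaging step described above can be written down in a few lines. The following is only a minimal sketch: the function name, the tiny abundance table, the species heights and the light values are all made up for illustration and do not come from the study.

```python
import numpy as np

def community_weighted_mean(abundance, trait, weighted=True):
    """Return one CWM value per site.

    abundance: sites x species table (rows are sites, columns are species)
    trait:     one attribute value per species (e.g. plant height)
    weighted:  if False, species count equally (presence/absence only)
    """
    abundance = np.asarray(abundance, dtype=float)
    trait = np.asarray(trait, dtype=float)
    if not weighted:
        # Replace abundances by 1 (present) or 0 (absent)
        abundance = (abundance > 0).astype(float)
    # Weighted sum of trait values at each site, divided by the total weight
    return abundance @ trait / abundance.sum(axis=1)

# Hypothetical data: 3 sites, 3 herb species
abundance = np.array([[2, 1, 0],
                      [0, 1, 3],
                      [1, 1, 1]])
height = np.array([10.0, 40.0, 100.0])   # species height in cm (invented)
light = np.array([0.8, 0.2, 0.5])        # site attribute (invented)

cwm_weighted = community_weighted_mean(abundance, height)
cwm_unweighted = community_weighted_mean(abundance, height, weighted=False)

# Relate the site means to the site attribute, here by Pearson correlation
r = np.corrcoef(cwm_weighted, light)[0, 1]
print(cwm_weighted, cwm_unweighted, round(r, 3))
```

The correlation computed in the last step is the quantity whose significance is then tested, and it is this test that the rest of the summary is concerned with.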

Illustration photo: light passing through the canopy to the forest understory. Temperate deciduous oak-hornbeam forest at Zadní Hády reserve, Brno, Czech Republic (May 23, 2010). Photo credit: David Zelený.

Although the CWM approach has been, and still is, widely used in a broad range of ecological disciplines, only relatively recently did it become clear that it suffers from a statistical problem. If tested by standard parametric (or analogous permutation) tests, the CWM approach returns results that tend to be overly optimistic, i.e. more significant than is warranted by the data. This is not good news: the scientific literature is flooded with reports of species-site relationships, some of which are mere artefacts. However, are all studies affected, and how serious is the problem for those that are?

In this study, I suggest that whether results based on the CWM approach are overly optimistic depends very much on the exact formulation of the tested hypothesis. Species attributes are related to site attributes via the species composition of the community at each site. The relationship therefore involves testing two questions at the same time: is the species composition related to site attributes, and is the species composition related to species attributes? If one of these relationships is known or can be assumed to exist, only the other one needs to be tested. Three categories of hypotheses tested by the CWM approach can therefore be distinguished; two of them return overly optimistic results if tested in the standard way, and one does not. This distinction is essential for evaluating how far the results of published studies can be trusted. I also show that, if a study falls into a category where overly optimistic results should be expected, how optimistic the results will be depends on the properties of the data (mainly on how much the communities differ from each other in species composition).
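
To make the idea of "two questions at the same time" more concrete, here is a rough sketch of how a max test could be implemented. This summary does not spell out the algorithm, so the code follows the common description of the max test in the literature: one permutation test shuffles site attributes among sites, another shuffles species attributes among species, and the larger of the two P-values is reported. The function names, the choice of absolute Pearson correlation as the test statistic and the random data are all assumptions made for illustration, not the author's implementation.

```python
import numpy as np

def cwm(abundance, trait):
    # Abundance-weighted mean of the species attribute at each site
    return abundance @ trait / abundance.sum(axis=1)

def max_test(abundance, trait, env, n_perm=999, seed=0):
    """Sketch of a max permutation test for the CWM-site attribute correlation.

    Returns (p_row, p_col, p_max). Assumed statistic: |Pearson r|.
    """
    rng = np.random.default_rng(seed)

    def stat(t, e):
        return abs(np.corrcoef(cwm(abundance, t), e)[0, 1])

    observed = stat(trait, env)
    # Row-based test: shuffle site attributes among sites
    row_hits = sum(stat(trait, rng.permutation(env)) >= observed
                   for _ in range(n_perm))
    # Column-based test: shuffle species attributes among species
    col_hits = sum(stat(rng.permutation(trait), env) >= observed
                   for _ in range(n_perm))
    p_row = (row_hits + 1) / (n_perm + 1)
    p_col = (col_hits + 1) / (n_perm + 1)
    # The relationship counts as significant only if both tests agree
    return p_row, p_col, max(p_row, p_col)

# Hypothetical random data: 20 sites, 8 species
rng = np.random.default_rng(1)
abundance = rng.integers(0, 5, size=(20, 8)).astype(float)
abundance[:, 0] += 1.0                 # make sure no site is empty
trait = rng.normal(size=8)             # invented species attribute
env = rng.normal(size=20)              # invented site attribute

p_row, p_col, p_max = max_test(abundance, trait, env, n_perm=199)
print(p_row, p_col, p_max)
```

Under this reading, a standard permutation test corresponds to the row-based part alone, which addresses only the link between species composition and site attributes; taking the maximum also requires the link between species composition and species attributes to hold.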

When using the results of studies based on the CWM approach, one needs to pay attention to which hypothesis the authors were testing and which test they used for it, and apply the guidelines introduced here to get an informed estimate of how reliable the results of the study are.

This is a plain language summary of the Synthesis paper by David Zelený published in the Journal of Vegetation Science.