Variable Filtering


Variable filtering is a novel ordination method developed by Peter Henderson and Richard Seaby of Pisces Conservation Ltd. In earlier versions of CAP it was called Species filtering; however, since the variables are not necessarily species, the name has been changed to reflect this. The description below refers throughout to 'species', but the variables could be any other suitable entity.

 

The utility of the method is yet to be assessed, but we believe that it has many useful features. In particular, it produces a 2-dimensional ordination of the sites that has a clear biological interpretation in terms of the species present in each sample. Our objective was an ordination that discriminates between sites as strongly as possible along axes that allow simple ecological interpretation. Ideally the method would work for both presence/absence and quantitative abundance data.

 

The method ordinates the sites in a two-dimensional space. The first dimension is an ordination in terms of species content and the second is simply a plot of the number of species present. The key feature is the unique way in which the sites are ordinated along axis 1. This will be described in detail below.

 

Imagine a sieve that allowed all the samples that did not hold a species to pass through it, but retained those samples that contained the species. We term such a sieve a negative filter. In contrast a filter that will only allow samples containing a particular species to pass through would be a positive filter. Now consider a series of species sieves placed in line so that progressively more and more of the samples are retained. To produce a useful ordination we need a set of rules that will determine which species should be used as sieves and in which order the sieves should be placed. The proposed rules are quite simple.
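The sieve idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CAP implementation: the function names, the sample data and the species names are hypothetical, and placing each sample at the first sieve that retains it is our reading of "a series of species sieves placed in line".

```python
def negative_filter(samples, species):
    """Return (retained, passed) for a negative filter on `species`.

    Samples that contain the species are retained; samples that do
    not hold it pass through.
    """
    retained = {name: sp for name, sp in samples.items() if species in sp}
    passed = {name: sp for name, sp in samples.items() if species not in sp}
    return retained, passed


def positive_filter(samples, species):
    """Return (retained, passed) for a positive filter on `species`.

    Only samples that contain the species pass through, so the two
    roles are swapped relative to the negative filter.
    """
    with_species, without_species = negative_filter(samples, species)
    return without_species, with_species


def sieve_positions(samples, sieve_order):
    """Pass samples through a line of negative filters.

    Assumption: a sample is placed at the index of the first sieve
    that retains it; samples that pass every sieve are placed after
    the last one.
    """
    positions = {}
    remaining = dict(samples)
    for index, species in enumerate(sieve_order):
        retained, remaining = negative_filter(remaining, species)
        for name in retained:
            positions[name] = index
    for name in remaining:
        positions[name] = len(sieve_order)
    return positions


# Hypothetical presence/absence data: sample name -> set of species.
samples = {
    "A": {"cod", "sprat"},
    "B": {"sprat"},
    "C": {"bass", "sprat"},
    "D": {"cod"},
}
print(sieve_positions(samples, ["bass", "cod", "sprat"]))
```

With the sieves in the order bass, cod, sprat, sample C is retained by the first sieve, A and D by the second, and B by the third.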

 

1. Species present at all sites are excluded. This is because they cannot be used to differentiate between sites.

2. When two or more species have the same pattern of occurrence at the sites a single sieve represents them all.

3. The order of sieves should be that which will produce the most even distribution of sites along the axis.
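Rules 1 and 2 amount to a simple pre-processing step, which might be sketched as follows. The function name `candidate_sieves` and the example data are hypothetical, and the sketch assumes presence/absence data held as a mapping from site name to a set of species.

```python
def candidate_sieves(samples):
    """Group species into sieves following rules 1 and 2.

    Returns a dict mapping each distinct occurrence pattern (the
    frozenset of sites where a species occurs) to the sorted list of
    species sharing that pattern.
    """
    all_sites = frozenset(samples)

    # Build the occurrence pattern of every species.
    occurrence = {}
    for site, species_set in samples.items():
        for species in species_set:
            occurrence.setdefault(species, set()).add(site)

    sieves = {}
    for species, sites in occurrence.items():
        pattern = frozenset(sites)
        if pattern == all_sites:
            continue  # Rule 1: present everywhere, cannot discriminate.
        # Rule 2: species with an identical pattern share one sieve.
        sieves.setdefault(pattern, []).append(species)
    return {pattern: sorted(names) for pattern, names in sieves.items()}


# Hypothetical data: sprat occurs at every site; cod and goby co-occur.
samples = {
    "A": {"cod", "goby", "eel", "sprat"},
    "B": {"eel", "sprat"},
    "C": {"bass", "sprat"},
}
for pattern, names in candidate_sieves(samples).items():
    print(sorted(pattern), names)
```

Here sprat is dropped under rule 1, and cod and goby are merged into a single sieve under rule 2. Rule 3, the ordering of the remaining sieves, is the hard part.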

 

Rule 3 is the key to a useful ordination, and in practice the best order is more difficult to discover than you might suppose. The number of possible arrangements of species filters increases factorially with the number of species, so that even with quite a modest number of species it is impractical to consider the merits of every arrangement. The solution to this problem is to use a numerical method termed annealing to seek an optimal solution. While the search for a good solution requires considerable computation, our experience suggests that even with quite large data sets a useful ordination is created within 2 to 5 minutes.
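To give a sense of scale, the number of possible orderings of n sieves is n!. The short snippet below simply evaluates this; the helper name `orderings` is ours.

```python
import math


def orderings(n_sieves):
    """Number of distinct orders in which n_sieves sieves can be placed."""
    return math.factorial(n_sieves)


for n in (5, 10, 15, 20):
    print(f"{n:2d} sieves: {orderings(n):,} possible orders")
```

Ten sieves already give more than 3.6 million orders, and twenty give roughly 2.4 x 10^18, which is why an exhaustive search is not attempted.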

 

When using a negative filter, which we recommend, the final ordination along the first axis will tend to arrange the sites (samples) in a clear and quite particular order. Starting from the left, the sites will initially be characterised by the presence of rare or unusual species. At the opposite end of the axis will be sites that hold only the most frequently present species. Sites at the left end of the ordination can be classified into two groups: those that contain both infrequent and frequent species, and those that hold only infrequent species. Sites that hold both will tend to have higher species richness than those that hold only rarer forms, and this immediately suggests that the second axis should simply be the total number of species in the sample.
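A minimal sketch of the resulting two coordinates is given below. It assumes, as our own reading of the method rather than a documented detail, that a site's position on axis 1 is the index of the first negative filter that retains it. The data and the name `ordination_coordinates` are hypothetical.

```python
def ordination_coordinates(samples, sieve_order):
    """Return {site: (axis1, axis2)} for presence/absence data.

    axis1: index of the first sieve species found in the sample
           (assumed placement rule; len(sieve_order) if none is found).
    axis2: number of species in the sample (species richness).
    """
    coords = {}
    for site, species in samples.items():
        axis1 = next(
            (i for i, sp in enumerate(sieve_order) if sp in species),
            len(sieve_order),
        )
        coords[site] = (axis1, len(species))
    return coords


# Hypothetical data: bass and cod are infrequent, sprat is frequent.
samples = {
    "A": {"bass", "cod", "sprat"},  # infrequent and frequent species
    "B": {"bass"},                  # infrequent species only
    "C": {"sprat"},
    "D": {"sprat"},
    "E": {"cod", "sprat"},
}
for site, xy in ordination_coordinates(samples, ["bass", "cod", "sprat"]).items():
    print(site, xy)
```

Sites A and B share the left-most position on axis 1 because both hold bass, but the second axis separates them: A holds three species and B only one.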

 

Annealing works by taking an initial, possibly arbitrary, arrangement of the species and changing it in a manner that makes it likely that any superior order will be accepted. Two types of change are allowed: (1) a randomly-chosen length of the species sequence is removed and reinserted at a randomly-chosen position, or (2) the order of a randomly-chosen length of the species sequence is reversed. After each change, a test is made to see whether the new arrangement produces a more even distribution of the sites. If it does, the new arrangement is accepted. The temperature comes into play by allowing possibly inferior arrangements to be accepted: the higher the temperature, the more likely this is to occur. This feature helps the program to find a globally optimal solution rather than become trapped by a local minimum. Annealing proceeds by a steady and gradual reduction in temperature.
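The procedure can be sketched as follows. This is an illustrative sketch and not the code used in CAP. Several details are assumptions because the description above does not specify them: the evenness measure (here the variance of the number of sites held at each sieve), the acceptance probability exp(-delta / temperature) for inferior arrangements, the geometric cooling schedule, and all function and parameter names.

```python
import math
import random


def unevenness(samples, order):
    """Assumed cost: variance of the site counts per axis-1 position.

    A site sits at the first sieve in `order` whose species it holds
    (or after the last sieve if it holds none). Lower is more even.
    """
    counts = [0] * (len(order) + 1)
    for species in samples.values():
        pos = next((i for i, sp in enumerate(order) if sp in species), len(order))
        counts[pos] += 1
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / len(counts)


def move_segment(order, rng):
    """Change (1): remove a random segment and reinsert it elsewhere."""
    i, j = sorted(rng.sample(range(len(order) + 1), 2))
    segment, rest = order[i:j], order[:i] + order[j:]
    k = rng.randrange(len(rest) + 1)
    return rest[:k] + segment + rest[k:]


def reverse_segment(order, rng):
    """Change (2): reverse the order of a random segment."""
    i, j = sorted(rng.sample(range(len(order) + 1), 2))
    return order[:i] + order[i:j][::-1] + order[j:]


def anneal(samples, order, t_start=1.0, t_end=0.001, cooling=0.95,
           steps_per_t=50, seed=0):
    """Search for a sieve order with low unevenness.

    Returns (best_order, best_cost). All parameters are illustrative.
    """
    rng = random.Random(seed)
    current = list(order)
    cost = unevenness(samples, current)
    best, best_cost = current, cost
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            change = rng.choice((move_segment, reverse_segment))
            candidate = change(current, rng)
            cand_cost = unevenness(samples, candidate)
            delta = cand_cost - cost
            # Better arrangements are always accepted; inferior ones
            # are accepted with a probability that falls as t falls.
            if delta < 0 or rng.random() < math.exp(-delta / t):
                current, cost = candidate, cand_cost
                if cost < best_cost:
                    best, best_cost = current, cost
        t *= cooling  # steady, gradual reduction in temperature
    return best, best_cost
```

A call such as `anneal(samples, ["sprat", "cod", "bass"])`, with `samples` mapping each site name to its set of species, returns the best order found and its cost. Keeping track of the best arrangement seen so far is a design choice of this sketch, so that a late acceptance of an inferior arrangement cannot lose a good solution.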

 

The first axis therefore consists of a series of species, which are labelled on the output. These species are the key members of the community, whose presence or absence can be used to distinguish between sites. The plot below shows the result of a variable filter analysis of the Hinkley fish demo data set:

 

[Figure: variable filtering ordination of the Hinkley fish demo data set]