Similarity and Distance Measures
Choose the similarity measure you wish to calculate from the Similarity drop-down menu. The calculations will appear on the Similarity tab at the bottom of the program window. For ease of use, the program will highlight sites with similarity above a certain level. You can set this level by entering the number in the Set threshold level box at the bottom of the page.
The similarity measures used.
These are simple measures of either the extent to which two habitats have species in common (Q analysis) or the extent to which two variables (species) have habitats in common (R analysis). Binary similarity coefficients use presence-absence data; following the introduction of computers, more complex quantitative coefficients became practicable. Applying a binary method to quantitative, rather than presence-absence, data may report perfect similarity between every pair of samples/sites in data sets (such as the Romano-British pottery demo data set) in which each variable is present in every sample.
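The last point can be illustrated with a minimal Python sketch. This is not CAP's own code: the counts are invented for illustration, the Jaccard coefficient is used simply as one well-known binary measure, and `to_presence_absence` and `jaccard` are hypothetical helper names.

```python
def to_presence_absence(counts):
    """Reduce quantitative values to 1 (present) or 0 (absent)."""
    return [1 if value > 0 else 0 for value in counts]


def jaccard(x, y):
    """Jaccard coefficient for two presence-absence lists of equal length."""
    shared = sum(1 for i, j in zip(x, y) if i and j)
    either = sum(1 for i, j in zip(x, y) if i or j)
    return shared / either


# Invented counts: every variable occurs in both samples,
# but in very different amounts.
site_1 = [120, 3, 45, 1]
site_2 = [2, 98, 1, 60]

similarity = jaccard(to_presence_absence(site_1), to_presence_absence(site_2))
print(similarity)  # 1.0, although the quantitative profiles differ strongly
```

Because a binary measure only sees presence or absence, all the information carried by the abundances is discarded, which is why a quantitative coefficient is usually the better choice for data of this kind.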
Both groups of indices can be further divided into those which take account of species absent from both communities (double-zero methods) and those which do not. In most ecological applications it is unwise to use double-zero methods, as they assign a high level of similarity to localities which both lack many species. This problem becomes particularly acute in habitats with a potentially very large species list, such as the marine benthos.
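As a rough illustration of the double-zero problem, the Python sketch below compares two localities that share no species at all, drawn from an invented regional list of 100 species. The simple matching coefficient (which counts joint absences as agreement) and the Jaccard coefficient (which ignores them) are used here only as familiar examples of each class; the function names and data are assumptions made for this example, not CAP's implementation.

```python
def match_counts(x, y):
    """Count species present in both, in exactly one, and in neither sample."""
    both = sum(1 for i, j in zip(x, y) if i and j)
    only_one = sum(1 for i, j in zip(x, y) if bool(i) != bool(j))
    neither = sum(1 for i, j in zip(x, y) if not i and not j)
    return both, only_one, neither


def simple_matching(x, y):
    """Double-zero measure: joint absences count as agreement."""
    both, only_one, neither = match_counts(x, y)
    return (both + neither) / (both + only_one + neither)


def jaccard(x, y):
    """Measure without double zeros: joint absences are ignored."""
    both, only_one, _ = match_counts(x, y)
    return both / (both + only_one)


# Invented data: 100 possible species, two species at each locality,
# none of them shared.
locality_1 = [1, 1, 0, 0] + [0] * 96
locality_2 = [0, 0, 1, 1] + [0] * 96

print(simple_matching(locality_1, locality_2))  # 0.96
print(jaccard(locality_1, locality_2))          # 0.0
```

The 96 species missing from both localities dominate the double-zero measure, so two communities with nothing in common appear almost identical.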
A good account of similarity and distance measures is given in Legendre & Legendre (1983). Not all measures can be calculated for every data set, because some would require division by zero. When a division-by-zero error would occur, CAP gives an index of -99.
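The sketch below shows, in Python, how such a sentinel value can arise. It only mimics the behaviour described above and makes no claim about how CAP is written internally; the Jaccard coefficient and the names `DIVISION_BY_ZERO` and `jaccard_or_sentinel` are assumptions chosen for the example.

```python
DIVISION_BY_ZERO = -99  # the value the text says CAP reports in this case


def jaccard_or_sentinel(x, y):
    """Jaccard coefficient, or -99 when the denominator would be zero."""
    shared = sum(1 for i, j in zip(x, y) if i and j)
    either = sum(1 for i, j in zip(x, y) if i or j)
    if either == 0:
        # Neither sample contains any species, so the ratio is undefined.
        return DIVISION_BY_ZERO
    return shared / either


print(jaccard_or_sentinel([0, 0, 0], [0, 0, 0]))  # -99
print(jaccard_or_sentinel([1, 1, 0], [1, 0, 0]))  # 0.5
```

When results are processed further, a value of -99 is best treated as "not calculable" rather than as a very low similarity.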
For measures of similarity between samples based on species presence-absence, the observations can be summarised in a simple frequency table:
                     Sample 2: present    Sample 2: absent
Sample 1: present            a                   b
Sample 1: absent             c                   d
where a is the number of species present in both samples, b is the number of species present in sample 1 but missing from sample 2, c is the number of species missing from sample 1 but present in sample 2, and d is the number of species missing from both samples. The total number of species, N, is therefore a+b+c+d.
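The four counts can be obtained directly from two presence-absence lists. The following Python sketch is one straightforward way to do this; the function name `contingency_counts` and the sample data are hypothetical, and both lists are assumed to cover the same species in the same order.

```python
def contingency_counts(sample_1, sample_2):
    """Return (a, b, c, d) for two presence-absence lists of equal length."""
    a = b = c = d = 0
    for in_1, in_2 in zip(sample_1, sample_2):
        if in_1 and in_2:
            a += 1  # present in both samples
        elif in_1:
            b += 1  # present in sample 1 only
        elif in_2:
            c += 1  # present in sample 2 only
        else:
            d += 1  # missing from both samples
    return a, b, c, d


# Invented data for six species.
sample_1 = [1, 1, 1, 0, 0, 0]
sample_2 = [1, 0, 1, 1, 0, 0]

a, b, c, d = contingency_counts(sample_1, sample_2)
print(a, b, c, d)     # 2 1 1 2
print(a + b + c + d)  # 6, the total number of species N
```

Every binary coefficient is then a simple expression in a, b, c and d; those that include d are the double-zero methods.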
Binary - double zeros
Binary - no double zeros