Applying Multivariate Methods using R, CAP and Ecom, by Peter Henderson & Richard Seaby
NOTE: for data sets for the previous version of this book, A Practical Handbook for Multivariate Methods, please go here.
Applying Multivariate Methods features R code examples, and a wide range of data sets from published work in different fields. So that the reader can really explore the featured methods in depth, we have made the R code and data sets available to be downloaded.
R code
All the R code examples featured in the book are available in a single zipped file: Rexamples.zip.
Please note: the R code snippets given in the book were tested and shown to work with the supplied data sets at the time of going to press (June 2019). The packages used are updated frequently, so some editing of the code may be necessary. We will endeavour to keep the downloadable versions of the code available here up to date. However, as the R packages themselves are beyond our control, we cannot guarantee that the code snippets will continue to work, or give the same results, indefinitely without some modification.
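Because results can change as packages are updated, it may help to record which versions are installed before running a snippet. The following is a minimal sketch only: the helper name report_versions is made up for illustration, and the package names are placeholders rather than the list of packages used in the book.

```r
# Record the R version and the versions of the packages a script relies on.
# The package names below are placeholders (an assumption), not a list taken
# from the book; replace them with the packages your script actually loads.
pkgs <- c("stats", "utils")

# Hypothetical helper: one row per package, with NA for anything not installed.
report_versions <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  versions <- vapply(
    pkgs,
    function(p) {
      if (installed[[p]]) as.character(utils::packageVersion(p)) else NA_character_
    },
    character(1)
  )
  data.frame(package = pkgs, installed = installed, version = versions,
             row.names = NULL, stringsAsFactors = FALSE)
}

print(R.version.string)
print(report_versions(pkgs))
```

Keeping this output alongside your results makes it easier to tell whether a later difference comes from the data or from a package update.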
Data sets
To download each data set, simply click on the link. Each file is small, usually 2-3 KB.
Each data set is in the form of a zipped file, containing the data as a .csv (comma-separated text) file, along with a file with a .pcg extension. This second file holds data relating to group assignments; if you open the .csv file in Community Analysis Package or Ecom, the samples will automatically be assigned to their selected groups, provided the .pcg file is in the same folder as the main data file.
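For readers working in R rather than CAP or Ecom, the sketch below shows one way to load an extracted data set and check that the .pcg file is still beside the .csv file. It makes several assumptions: the file names and values are invented stand-ins, the helper load_dataset is hypothetical, and the layout (column names in the first row, row labels in the first column) may not match every data set.

```r
# Simulate an extracted archive so the sketch runs anywhere. With a real
# download you would instead call, for example:
#   utils::unzip("dataset.zip", exdir = folder)   # hypothetical archive name
folder <- tempfile("dataset_")
dir.create(folder)
writeLines(c("variable,S1,S2,S3",
             "A,1.2,0.4,3.1",
             "B,0.0,2.5,1.7"),
           file.path(folder, "example.csv"))      # made-up values
file.create(file.path(folder, "example.pcg"))     # empty stand-in file

# Hypothetical helper: find the single .csv file in a folder and read it,
# warning if no .pcg file sits beside it.
load_dataset <- function(folder) {
  csv <- list.files(folder, pattern = "\\.csv$", full.names = TRUE,
                    ignore.case = TRUE)
  pcg <- list.files(folder, pattern = "\\.pcg$", full.names = TRUE,
                    ignore.case = TRUE)
  if (length(csv) != 1L) stop("Expected exactly one .csv file in ", folder)
  if (length(pcg) == 0L) {
    warning("No .pcg file found beside the .csv file; ",
            "CAP or Ecom would not assign the samples to groups")
  }
  # Assumption: first row = column names, first column = row labels.
  read.csv(csv, row.names = 1, check.names = FALSE)
}

dat <- load_dataset(folder)
str(dat)
```

R does not use the .pcg file itself; the check is there only so that the same folder still works when opened in CAP or Ecom.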
The data for use in Ecom comprise two data sets: one containing species/sample data and the other containing environmental variables.
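When a pair of tables like this is read into R, it is worth confirming that both refer to the same samples in the same order before any joint analysis. The sketch below uses invented values and a hypothetical helper, align_samples, and assumes that each row of both tables is a sample identified by its row name; the supplied files may be arranged differently.

```r
# Made-up stand-ins for a species/sample table and an environmental table.
# Assumption: in both tables each row is a sample, labelled by its row name.
species <- data.frame(sp1 = c(5, 0, 2), sp2 = c(1, 3, 0),
                      row.names = c("site1", "site2", "site3"))
env_vars <- data.frame(depth = c(12, 4, 8), temp = c(9.5, 11.0, 10.2),
                       row.names = c("site3", "site1", "site2"))

# Hypothetical helper: keep only the samples present in both tables and put
# them in the same order, warning about any sample that has to be dropped.
align_samples <- function(species, env_vars) {
  shared <- intersect(rownames(species), rownames(env_vars))
  dropped <- setdiff(union(rownames(species), rownames(env_vars)), shared)
  if (length(dropped) > 0L) {
    warning("Samples present in only one table were dropped: ",
            paste(dropped, collapse = ", "))
  }
  list(species = species[shared, , drop = FALSE],
       env_vars = env_vars[shared, , drop = FALSE])
}

aligned <- align_samples(species, env_vars)
stopifnot(identical(rownames(aligned$species), rownames(aligned$env_vars)))
print(aligned$env_vars)
```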
Example from archaeology: the classification of Jomon pottery sherds
Demonstration data set: Jomon_Hall.csv
R demonstration data set: Jomon_Hall R.csv
R demonstration data set: Jomon Hall Shouninzuka R.csv
Reference: Hall, M. E., 2001. Pottery styles during the early Jomon period: geochemical perspectives on the Moroiso and Ukishima pottery styles. Archaeometry 43: 59-75.
Example from geology: An investigation of the Martinsville igneous complex
Demonstration data set: Petrology.csv
R demonstration data set: Petrology R.csv
Reference: P. C. Ragland, J. F. Conley, W. C. Parker, and J. A. Van Orman, 1997, Use of Principal Components Analysis in petrology: an example from the Martinsville igneous complex, Virginia, U.S.A. Mineralogy and Petrology 60: 165-184.
Example from biology: Comparing the songs of cicadas
Demonstration data set: cicada.csv
Reference: Ohya, E., 2004. Identification of Tibicen cicada species by a Principal Components Analysis of their songs. Anais da Academia Brasileira de Ciências 76: 441-444.
Example from biology: Analysis of community change with climate warming
Demonstration data set: Hinkley fish.csv
R demonstration data set: Hinkley fish R.csv
Reference: Henderson, P. A., 2007. Discrete and continuous change in the fish community of the Bristol Channel in response to climate change. J. Mar. Biol. Ass. U.K. (2007), 87: 589-598
Chapter 4: Correspondence Analysis (CA)
Example from marketing: Soft drink consumption
Demonstration data set: Beverages.csv
R demonstration data set: Beverages R2.csv
Reference: Hoffman, D. L. & Franke, G. R., 1986, Correspondence Analysis: Graphical Representation of Categorical Data in Marketing Research. Journal of Marketing Research, 23: 213-227.
Example from archaeology: The temporal relationship between the pots from trenches on the Greek island of Melos
Demonstration data set: Melos.csv
R demonstration data set: Melos R grouped.csv
Reference: Berg, I. & Blieden, S., 2000. The Pots of Phylakopi: Applying Statistical Techniques to Archaeology. Chance, 13: 8-15.
Example from biology: Analysis of community change with climate warming
Demonstration data set: Hinkley fish.csv
R demonstration data set: Hinkley fish R.csv
Reference: Henderson, P. A., 2007. Discrete and continuous change in the fish community of the Bristol Channel in response to climate change. J. Mar. Biol. Ass. U.K. (2007), 87: 589-598
Chapter 5: Multidimensional Scaling (MDS)
Example from archaeology: Seriation of Nigerian pottery
Demonstration data set: Nigerian pottery.csv
Reference: Usman, A. A., 2003, Ceramic Seriation, Sites Chronology, and Old Oyo Factor in North-central Yorubaland, Nigeria. African Archaeological Review, 20: 149-169.
Example from biology: Analysis of fish community change
Demonstration data set: Hinkley annual fish.csv
R demonstration data set: Hinkley fish R.csv
Reference: Henderson, P. A., 2007. Discrete and continuous change in the fish community of the Bristol Channel in response to climate change. J. Mar. Biol. Ass. U.K. (2007), 87: 589-598
Example from geology: Ordovician mollusc assemblages
Demonstration data set: Ordovician fossils.csv
Reference: Novack-Gottshall, P. M., Miller, A. I. 2003, Comparative Taxonomic Richness and Abundance of Late Ordovician Gastropods and Bivalves in Mollusc-rich Strata of the Cincinnati Arch. PALAIOS, 18: 559-571.
Chapter 6: Linear Discriminant Analysis (DA)
Example from biology: Iris systematics
Demonstration data set: irises.csv
Reference: Fisher, R.A. (1936). The Use of Multiple Measurements in Taxonomic Problems. Annals of Eugenics 7: 179-188.
Example from archaeology: Skull shape
Demonstration data set: Egyptian skulls.csv
Reference: Thomson, A. and Randall-Maciver, R. (1905) Ancient Races of the Thebaid, Oxford: Oxford University Press. Also found in: Hand, D.J., et al. (1994) A Handbook of Small Data Sets, New York: Chapman & Hall, pp. 299-301. Manly, B.F.J. (1986) Multivariate Statistical Methods, New York: Chapman & Hall.
Example from archaeology: Chemical analysis of Romano-British pottery
Demonstration data set: Romano British pottery.csv
R demonstration data set: roman pottery R.csv
Reference: Tubb, A., Parker, A.J. and Nickless, G. (1980) The analysis of Romano-British pottery by atomic absorption spectrophotometry. Archaeometry, 22: 153-171. Also found in: Hand, D.J., et al. (1994) A Handbook of Small Data Sets, London: Chapman & Hall, 252.
Example from palaeontology: Palaeogeography of forest trees in the Czech Republic around 2000 BP
Demonstration data set: Pollen data biological.csv and Pollen data environmental.csv
Reference: Petr Pokorný (2002) Palaeogeography of forest trees in the Czech Republic around 2000 BP: Methodical approach and selected results. Preslia, Praha, 74: 235-246.
Chapter 8: TWINSPAN (Two-Way Indicator Species Analysis)
Example from biology: Woody species composition of floodplain forests of the Little River, McCurtain and LeFlore Counties, Oklahoma
Demonstration data set: Oklahoma floodplain forest.csv
Reference: B. W. Hoagland, L. R. Sorrels and S. M. Glenn (1996). Woody species composition of floodplain forests of the Little River, McCurtain and LeFlore Counties, Oklahoma. Proc. Okla. Acad. Sci. 74: 23-29.
Reference: Henderson, P. A. (2007) Discrete and continuous change in the fish community of the Bristol Channel in response to climate change. J. Mar. Biol. Ass. UK. 87: 589-598.
Example from biology: Woody species composition of floodplain forests of the Little River, McCurtain and LeFlore Counties, Oklahoma
Demonstration data set: Oklahoma floodplain forest.csv
Reference: B. W. Hoagland, L. R. Sorrels and S. M. Glenn (1996). Woody species composition of floodplain forests of the Little River, McCurtain and LeFlore Counties, Oklahoma. Proc. Okla. Acad. Sci. 74: 23-29.
Example from biology: The effect of climate change on an estuarine fish community
Demonstration data sets: Hinkley annual fish.csv and Hinkley fish.csv
Reference: Henderson, P. A. (2007) Discrete and continuous change in the fish community of the Bristol Channel in response to climate change. J. Mar. Biol. Ass. UK. 87: 589-598.
Example from veterinary science: Body-weight changes during growth in puppies of different breed
Demonstration data set: dog growth study.csv
Reference: Hawthorne, A. J., Booles, D., Nugent, P. A., Gettinby, G. and Wilkinson, J. (2004) Body-weight changes during growth in puppies of different breeds. American Society for Nutritional Sciences. J. Nutr. 134: 2027S-2030S.
Example from linguistics: Authors' characteristic writing styles as seen through their use of commas
Demonstration data set: comma placement.csv
Reference: Jin, M. & Murakami, M. (1993) Authors' characteristic writing styles as seen through their use of commas. Behaviormetrika, 20: 63-74.
Example: cluster analysis and heat maps in R
Demonstration data set: Hinkley annual data.csv
Reference: Henderson, P. A. (2007) Discrete and continuous change in the fish community of the Bristol Channel in response to climate change. J. Mar. Biol. Ass. UK. 87: 589-598.
Chapter 10: Analysis of Similarities (ANOSIM)
Example from archaeology: The classification of Jomon pottery sherds
Demonstration data set: Jomon_Hall.csv
Reference: Hall, M. E., 2001. Pottery styles during the early Jomon period: geochemical perspectives on the Moroiso and Ukishima pottery styles. Archaeometry 43: 59-75.