20 NMDS

20.1 Ordination plot of disimilarites using NMDS

In the dune case, the multiple dimensions are the many species. What patterns exist among them? Which seem to like similar conditions and tend to appear at the same sites, and vice versa. Also, which sites are similar to each other, in respect of what grows there, and which are different? These are the questions that ordination plots are intended to shed light on. For data based on the presence/absence/abundance of species, Non-Metric Multidimensional Scaling (NMDS) is a widely used type of ordination technique.

Here, we will entirely skate over the details, but will carry out the analysis using the vegan function metaMDS() then plot the result.

We will use as an example the dune data set that comes as part of the vegan package. This has ordinal plant species abundance data in columns and sites in rows. There are 20 sites and 30 species altogether, although most sites only have a few of these.

library(tidyverse)
library(vegan)
data(dune)
glimpse(dune)

Rows: 20
Columns: 30
$ Achimill <dbl> 1, 3, 0, 0, 2, 2, 2, 0, 0, 4, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0
$ Agrostol <dbl> 0, 0, 4, 8, 0, 0, 0, 4, 3, 0, 0, 4, 5, 4, 4, 7, 0, 0, 0, 5
$ Airaprae <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 3, 0
$ Alopgeni <dbl> 0, 2, 7, 2, 0, 0, 0, 5, 3, 0, 0, 8, 5, 0, 0, 4, 0, 0, 0, 0
$ Anthodor <dbl> 0, 0, 0, 0, 4, 3, 2, 0, 0, 4, 0, 0, 0, 0, 0, 0, 4, 0, 4, 0
$ Bellpere <dbl> 0, 3, 2, 2, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0
$ Bromhord <dbl> 0, 4, 0, 3, 2, 0, 2, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
$ Chenalbu <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0
$ Cirsarve <dbl> 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
$ Comapalu <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0
$ Eleopalu <dbl> 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 4, 5, 8, 0, 0, 0, 4
$ Elymrepe <dbl> 4, 4, 4, 4, 4, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
$ Empenigr <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0
$ Hyporadi <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 5, 0
$ Juncarti <dbl> 0, 0, 0, 0, 0, 0, 0, 4, 4, 0, 0, 0, 0, 0, 3, 3, 0, 0, 0, 4
$ Juncbufo <dbl> 0, 0, 0, 0, 0, 0, 2, 0, 4, 0, 0, 4, 3, 0, 0, 0, 0, 0, 0, 0
$ Lolipere <dbl> 7, 5, 6, 5, 2, 6, 6, 4, 2, 6, 7, 0, 0, 0, 0, 0, 0, 2, 0, 0
$ Planlanc <dbl> 0, 0, 0, 0, 5, 5, 5, 0, 0, 3, 3, 0, 0, 0, 0, 0, 2, 3, 0, 0
$ Poaprat  <dbl> 4, 4, 5, 4, 2, 3, 4, 4, 4, 4, 4, 0, 2, 0, 0, 0, 1, 3, 0, 0
$ Poatriv  <dbl> 2, 7, 6, 5, 6, 4, 5, 4, 5, 4, 0, 4, 9, 0, 0, 2, 0, 0, 0, 0
$ Ranuflam <dbl> 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 2, 2, 2, 2, 0, 0, 0, 4
$ Rumeacet <dbl> 0, 0, 0, 0, 5, 6, 3, 0, 2, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0
$ Sagiproc <dbl> 0, 0, 0, 5, 0, 0, 0, 2, 2, 0, 2, 4, 2, 0, 0, 0, 0, 0, 3, 0
$ Salirepe <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 5
$ Scorautu <dbl> 0, 5, 2, 2, 3, 3, 3, 3, 2, 3, 5, 2, 2, 2, 2, 0, 2, 5, 6, 2
$ Trifprat <dbl> 0, 0, 0, 0, 2, 5, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
$ Trifrepe <dbl> 0, 5, 2, 1, 2, 5, 2, 2, 3, 6, 3, 3, 2, 6, 1, 0, 0, 2, 2, 0
$ Vicilath <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0
$ Bracruta <dbl> 0, 0, 2, 2, 2, 6, 2, 2, 2, 2, 4, 4, 0, 0, 4, 4, 0, 6, 3, 4
$ Callcusp <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 3, 0, 0, 0, 3

ord <- metaMDS(dune)

Run 0 stress 0.1192678 
Run 1 stress 0.1183186 
... New best solution
... Procrustes: rmse 0.02027061  max resid 0.06496285 
Run 2 stress 0.1183186 
... New best solution
... Procrustes: rmse 1.371255e-06  max resid 2.943121e-06 
... Similar to previous best
Run 3 stress 0.1183186 
... Procrustes: rmse 4.739245e-06  max resid 1.521716e-05 
... Similar to previous best
Run 4 stress 0.1192678 
Run 5 stress 0.1812932 
Run 6 stress 0.1192678 
Run 7 stress 0.1183186 
... Procrustes: rmse 2.624054e-06  max resid 9.232593e-06 
... Similar to previous best
Run 8 stress 0.2014886 
Run 9 stress 0.1183186 
... Procrustes: rmse 3.516434e-06  max resid 1.241299e-05 
... Similar to previous best
Run 10 stress 0.1183186 
... Procrustes: rmse 3.715578e-06  max resid 1.278401e-05 
... Similar to previous best
Run 11 stress 0.1183186 
... Procrustes: rmse 3.622406e-06  max resid 9.459851e-06 
... Similar to previous best
Run 12 stress 0.1922241 
Run 13 stress 0.1192678 
Run 14 stress 0.1183186 
... Procrustes: rmse 2.153737e-05  max resid 6.164729e-05 
... Similar to previous best
Run 15 stress 0.1183186 
... Procrustes: rmse 2.851685e-06  max resid 9.468153e-06 
... Similar to previous best
Run 16 stress 0.1183186 
... Procrustes: rmse 1.675375e-05  max resid 5.20301e-05 
... Similar to previous best
Run 17 stress 0.1192679 
Run 18 stress 0.1192678 
Run 19 stress 0.1192679 
Run 20 stress 0.1192678 
*** Best solution repeated 9 times

plot(ord, type = "t")
points(ord, display = "sites", cex = 0.2, pch=21, col="red", bg="yellow")

#text(ord, display = "spec", cex=0.7, col="blue")

The plot shows in two dimensions the sites, here numbered 1-18, and in red, the species. Sites that appear close together on this plot are more similar than sites that are far apart. Species that are close together on the plot tend to occur together, while species that are far apart do so less often.

Check that this plot accords with the dendrogram above.

Improvements to this plot can be made using ggplot - plenty of blogs exist that tell you how to do this.

20.2 How well does the NMDS plot reflect true patterns in the data?

This can be quantified by calculating the ‘stress’ of the process. A low value indicates that the NMDS ordination plot is capturing the true patterns in the data well. Anything above about 0.2 means that it is doing a poor job and that you may need to transform the data in some way and try again.

ord$stress

[1] 0.1183186

We find here a stress value of less than 0.2, so we can infer that the NMDS plot gives a useful indication of patterns in the data.

A Shephard Plot is another way to visually indicatye the degree of stress. If the data ate tightly grouped around a diagonal straight line then there is low strees. If they are, the stres is higher.

stressplot(ord)

For our data, the plot indicates low stress, in accordance with the direct measurement of the stress value.