Computational protocol: What factors potentially influence the ability of phylogenetic distance to predict trait dispersion in a temperate forest?

Protocol publication

[…] To reduce the bias in the measurements caused by polytomies among the terminal taxa reconstructed by Phylomatic (Webb & Donoghue, ), particularly for MNTD, we reconstructed the phylogenetic relationships using DNA barcoding (Kress et al., ). Three sequences for each species were collected from GenBank: two plastid genes (rbcL and matK) and one nuclear DNA region (ITS); some species were also sequenced following Kress et al. () (see the methods in Liu et al. ()). Only one of the 41 species lacked all three sequences, and for that species we used the rbcL sequence from a congeneric species as a proxy. All sequences were aligned using MUSCLE (Edgar, ): rbcL and matK were aligned globally, whereas the ITS sequences were aligned within orders or families. We combined the aligned rbcL, matK, and multiple ITS alignments into a supermatrix using the supermat function of the phylotools package in R 3.2.5 (R Core Team ). This supermatrix was then input into raxmlGUI version 1.5b1 (Silvestro & Michalak, ) to construct a maximum likelihood phylogeny. To keep the topology consistent with the APG III phylogeny, we used an order-level constraint tree constructed by Phylomatic to fix the deep nodes a priori (Kress et al., ; Muscarella et al., ). The maximum likelihood tree was calibrated by nonparametric rate smoothing in the software r8s (Sanderson, ) to obtain an ultrametric phylogenetic tree (Figure ), which was then used to calculate the phylogenetic dispersion.

[...] We used the observed MPD and MNTD (Webb et al., ) to quantify the phylogenetic dispersion and compare it with null communities. We generated 999 null communities by randomizing species names on the phylogenetic tree, and then calculated the standardized effect sizes of MPD and MNTD (i.e., SES.MPD and SES.MNTD).
The calculation was as follows:

$$\mathrm{SES.MPD} = \frac{\mathrm{MPD}_{\mathrm{obs}} - \mathrm{mean}(\mathrm{MPD}_{\mathrm{null}})}{\mathrm{SD}(\mathrm{MPD}_{\mathrm{null}})}$$

$$\mathrm{SES.MNTD} = \frac{\mathrm{MNTD}_{\mathrm{obs}} - \mathrm{mean}(\mathrm{MNTD}_{\mathrm{null}})}{\mathrm{SD}(\mathrm{MNTD}_{\mathrm{null}})}$$

where MPD and MNTD are the mean pairwise phylogenetic distance and the mean nearest taxon distance, respectively, between all individuals within an observed community ($\mathrm{MPD}_{\mathrm{obs}}$ and $\mathrm{MNTD}_{\mathrm{obs}}$) or a random community ($\mathrm{MPD}_{\mathrm{null}}$ and $\mathrm{MNTD}_{\mathrm{null}}$). Negative SES.MPD and SES.MNTD values indicate phylogenetic clustering, whereas positive values indicate phylogenetic overdispersion.

These analyses were repeated at multiple spatial scales (10 m × 10 m, 20 m × 20 m, 30 m × 30 m and 50 m × 50 m) and for three size classes (small, medium, and large). The canopy species were divided into three size classes: small (dbh ≤ 5.0 cm), medium (5.0 < dbh ≤ 10.0 cm), and large (dbh > 10.0 cm; Piao, Comita, Jin, & Kim, ). We regarded species whose maximum dbh reached the largest size class in our study (i.e., dbh > 10.0 cm) as canopy species (28 species in total; Swenson, Enquist, Thompson, & Zimmerman, ; Yang et al., ). In addition, we performed these analyses for all 41 species (i.e., all individuals) for comparison.

For simplicity, we quantified trait dispersion using the same metrics and formulas as for phylogenetic dispersion (i.e., SES.MPD and SES.MNTD). Trait dendrograms were constructed for all eight traits combined and for each of the eight traits individually, so that trait dispersion could be compared with the phylogenetic results (Swenson, Erickson, et al. ). To reduce trait redundancy when analyzing all traits together, we first calculated the principal components (PCs) for all species and for the canopy species. We then used the first five PCs (which explained 94.2% of the variation) for all species and the first four PCs (which explained 91.6% of the variation) for the canopy species to calculate the Euclidean trait distance matrix. Finally, dendrograms for all traits combined and for each of the eight individual traits were generated by hierarchical clustering.
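
The standardized effect sizes defined above, which apply equally to a phylogenetic tree and to a trait dendrogram, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the source does not name the software used for this step. The function names (`mpd`, `mntd`, `ses`), the nested-dictionary distance matrix, and the toy six-species pool are all assumptions made for illustration. The sketch also treats communities as presence/absence species lists, whereas the source describes distances "between all individuals", so abundance weighting is left out.

```python
import random
import statistics
from itertools import combinations


def mpd(dist, species):
    """Mean pairwise distance among the species of one community."""
    pairs = list(combinations(species, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def mntd(dist, species):
    """Mean distance from each species to its nearest co-occurring species."""
    return statistics.mean(
        min(dist[a][b] for b in species if b != a) for a in species
    )


def ses(metric, dist, community, pool, runs=999, seed=0):
    """Standardized effect size: (obs - mean(null)) / SD(null).

    Null communities are built by shuffling species names across the
    tips of the tree (equivalently, across the distance matrix).
    """
    rng = random.Random(seed)
    observed = metric(dist, community)
    null = []
    for _ in range(runs):
        shuffled = pool[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(pool, shuffled))  # old name -> randomized name
        null.append(metric(dist, [relabel[s] for s in community]))
    return (observed - statistics.mean(null)) / statistics.stdev(null)


# Hypothetical pool: two clades (a*, b*); close within a clade, far between.
pool = ["a1", "a2", "a3", "b1", "b2", "b3"]
dist = {
    s: {t: 0.0 if s == t else (2.0 if s[0] == t[0] else 10.0) for t in pool}
    for s in pool
}

print("SES.MPD, single-clade community:", ses(mpd, dist, ["a1", "a2", "a3"], pool))
print("SES.MNTD, single-clade community:", ses(mntd, dist, ["a1", "a2", "a3"], pool))
print("SES.MPD, mixed-clade community:", ses(mpd, dist, ["a1", "b1"], pool))
```

In this toy pool, a community drawn from a single clade yields negative values (clustering) and a community spanning both clades yields a positive SES.MPD (overdispersion), matching the sign convention in the text. Passing cophenetic distances from a trait dendrogram instead of a phylogeny gives the trait-dispersion version of the same statistic.
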
Prior to these analyses, all traits were log-transformed and scaled to approximately zero mean and unit variance (Swenson, ). The trait dispersion was then quantified following the same steps as the computation of SES.MPD and SES.MNTD, using a trait dendrogram instead of the phylogenetic tree.

Phylogenetic signal tests at the species pool level and the community level were implemented using Blomberg's K statistic (Blomberg, Garland, Ives, & Crespi, ). At the species pool level, all 41 species for all trees and the 28 canopy species for the three size classes (i.e., small, medium, and large) were used to test phylogenetic signals. At the community level, the species-pool phylogenetic tree was pruned to the species occurring in a community to generate a community-level phylogenetic tree; this pruned tree was used to test phylogenetic signals, and the process was repeated for all subcommunities in the FDP. We selected SLA, LA, and LT to compare the phylogenetic signals between the species pool level and the community level because they showed stronger phylogenetic signals at the species pool level (K > 1; Table ). In addition, we divided all communities into phylogenetically clustered and overdispersed communities based on SES.MPD, and then compared the phylogenetic signals between communities with different dispersion patterns. To test the significance of the K values, we randomly shuffled the trait data on the phylogenetic tree 999 times to generate null distributions and calculate P values (Swenson, ). These tests were implemented using the multiPhylosignal function of the picante package. […]
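
The tip-shuffling significance test for Blomberg's K described above can be illustrated with a small, self-contained Python sketch. The source ran this step with `multiPhylosignal` from picante in R. The code below is an independent illustration, not that function. It assumes the tree is supplied as a phylogenetic variance-covariance matrix (shared branch lengths between tips), uses the usual formulation of K as the observed ratio MSE0/MSE divided by its Brownian-motion expectation, and derives the P value by comparing K directly with its permuted values, which may differ from how picante computes its P value. The names `solve`, `blomberg_k`, `k_permutation_test`, and the eight-tip toy matrix are hypothetical.

```python
import random


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def blomberg_k(trait, vcv):
    """Blomberg's K for one trait, given the phylogenetic covariance matrix."""
    n = len(trait)
    sum_cinv = sum(solve(vcv, [1.0] * n))        # 1' C^-1 1
    a_hat = sum(solve(vcv, trait)) / sum_cinv    # phylogenetic mean
    resid = [x - a_hat for x in trait]
    mse0 = sum(r * r for r in resid) / (n - 1)
    mse = sum(r * c for r, c in zip(resid, solve(vcv, resid))) / (n - 1)
    expected = (sum(vcv[i][i] for i in range(n)) - n / sum_cinv) / (n - 1)
    return (mse0 / mse) / expected


def k_permutation_test(trait, vcv, reps=999, seed=0):
    """Shuffle trait values across tips and compare permuted K with observed K."""
    rng = random.Random(seed)
    observed = blomberg_k(trait, vcv)
    shuffled = trait[:]
    hits = 0
    for _ in range(reps):
        rng.shuffle(shuffled)
        if blomberg_k(shuffled, vcv) >= observed:
            hits += 1
    return observed, (hits + 1) / (reps + 1)


# Hypothetical 8-tip ultrametric tree: two clades of four tips, tree height 1.0,
# with tips in the same clade sharing 0.8 of their root-to-tip path.
n_tips = 8
vcv = [
    [1.0 if i == j else (0.8 if i // 4 == j // 4 else 0.0) for j in range(n_tips)]
    for i in range(n_tips)
]
clustered = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]    # trait follows clades
interleaved = [1.0, 5.0, 1.2, 5.3, 0.9, 4.8, 1.1, 5.1]  # trait ignores clades

print("clustered trait (K, P):", k_permutation_test(clustered, vcv))
print("interleaved trait K:", blomberg_k(interleaved, vcv))
```

A community-level test in the sense of the text would apply the same function to the covariance matrix and trait values restricted to the species present in each subcommunity.
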

Pipeline specifications