Computational protocol: Millimeter-scale genetic gradients and community-level molecular convergence in a hypersaline microbial mat

[…] Mat core samples were collected around 1400 hours from pond 4 near pond 5 at the Exportadora de Sal saltworks, Guerrero Negro, Baja California Sur, Mexico. The salinity of the bulk water above the mat was ∼9% (∼3× the salinity of seawater). Other metadata for the sample can be found in . Four replicate cores were collected and sectioned into layers with sterile scalpels, and DNA was extracted, normalized, pooled and sequenced as described in . Metagenome sequence data are available under the following GenBank accession numbers: ABPP00000000, ABPQ00000000, ABPR00000000, ABPS00000000, ABPT00000000, ABPU00000000, ABPV00000000, ABPW00000000, ABPX00000000, ABPY00000000.

Community composition analysis was performed using the consensus of (i) best BlastP hits to the IMG/M database and (ii) phylogenetic mapping of signature genes onto a phylogenetic tree. See for details.

Gene-based functional gradients were calculated as follows: genes were assigned to their COG families and Pfam domains based on RPS-BLAST. The gradients were examined for possible over-representation of groups or of individual families or domains, and 1000 bootstrap iterations were used to assess the significance of over-representation. The described gradients were independently confirmed using two databases: IMG/M and the STRING database. Further details, as well as the groupings of families/domains, are described in .

Isoelectric point distributions, amino-acid composition and GC content were computed using appropriate Perl scripts and modules as described in . […]
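The per-layer sequence statistics mentioned above can be illustrated with a minimal sketch. This is not the authors' Perl code, which is not reproduced in this excerpt; it is a Python illustration under stated assumptions. The function names `gc_content` and `amino_acid_composition`, the layer labels and the toy sequences are all hypothetical. GC content is taken here as the fraction of G and C among unambiguous bases, and amino-acid composition as the pooled relative frequency of the 20 standard residues. Isoelectric point is left out because it depends on a choice of pKa values that the text does not state.

```python
from collections import Counter

# The 20 standard amino acids; any other symbol (X, *, gaps) is ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def gc_content(dna: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are excluded from the denominator
    (an assumption; the source does not say how they were handled).
    """
    counts = Counter(dna.upper())
    gc = counts["G"] + counts["C"]
    total = gc + counts["A"] + counts["T"]
    return gc / total if total else 0.0


def amino_acid_composition(proteins) -> dict:
    """Pooled relative frequency of each standard amino acid."""
    counts = Counter()
    for protein in proteins:
        counts.update(ch for ch in protein.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


# Toy data only: made-up reads for two hypothetical mat layers.
layer_reads = {
    "layer_1": ["ATGCGC", "GGCCTA"],
    "layer_2": ["ATATAT", "ATGCAT"],
}

for layer, reads in layer_reads.items():
    # Pool all reads of a layer before computing its GC fraction.
    print(layer, round(gc_content("".join(reads)), 3))
```

Pooling the sequences of a layer before counting weights each layer by its total sequence length; averaging per-sequence values instead would weight every read or gene equally, and the excerpt does not say which of the two the original scripts used.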

