MG-RAST is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. The system currently hosts over 200,000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to several new features, we report changes to the analysis workflow and the technologies used to scale the pipeline up to the required throughput levels.
Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, IL, USA; University of Chicago, Chicago, IL, USA; Purdue University, School of Electrical & Computer Engineering, West Lafayette, IN, USA; Purdue University, Department of Computer Sciences, West Lafayette, IN, USA
MG-RAST funding source(s)
This work was supported in part by the NIH award U01HG006537 “OSDF: Support infrastructure for NextGen sequence storage, analysis, and management,” by the Gordon and Betty Moore Foundation with the grant “6-34881, METAZen-Going the Last Mile for Solving the Metadata Crisis,” and by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, under contract DE-AC02-06CH11357 as part of “Resource Aware Intelligent Network Services (RAINS).”