HM16STR All - 16S rRNA Trimmed Data Set
Raw 16S sequence reads must be processed before they can be used to infer taxonomic information. The HMP DCC performed baseline processing and analysis of all 16S variable region sequences generated from more than 14,000 samples, drawn both from healthy human subjects and from subjects associated with fourteen demonstration projects investigating correlations between the microbiome and human health and disease. Here we provide access to all trimmed, deconvoluted FASTA files.
These trimmed datasets were then processed by a pipeline that ran four analysis steps: (a) 16S reference alignment via the NAST-iEr alignment tool; (b) chimera identification via ChimeraSlayer; (c) aberrant sequence identification via WigeoN; and (d) taxonomic binning using the RDP classifier. The first three steps were performed using components of the Broad Institute’s Microbiome Utilities Portal.
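Before running any downstream analysis on a trimmed FASTA file, it can help to check how many reads it contains and how long they are. The following is a minimal sketch in Python, not part of the HMP DCC pipeline. It assumes standard FASTA layout (a header line starting with ">" followed by one or more sequence lines). The function names read_fasta and summarize and the example read identifiers are hypothetical, chosen for illustration only.

```python
from io import StringIO
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may wrap, so collect them until the next header.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(handle: TextIO) -> Dict[str, float]:
    """Return the read count and min, max, and mean read length."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    if not lengths:
        return {"reads": 0, "min_len": 0, "max_len": 0, "mean_len": 0.0}
    return {
        "reads": len(lengths),
        "min_len": min(lengths),
        "max_len": max(lengths),
        "mean_len": sum(lengths) / len(lengths),
    }


# Hypothetical two-read input standing in for a trimmed FASTA file.
example = StringIO(">read_1\nACGTACGT\nACGT\n>read_2\nGGCC\n")
print(summarize(example))
```

To run this on a downloaded file, replace the StringIO example with an ordinary file handle, for instance by calling summarize inside a "with open(path) as handle:" block, where path points to a local copy of one of the trimmed files.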
Protocols and Tools