HMMCP Healthy

Following a July 2010 16S data freeze, data was downloaded from NCBI SRA projects SRP002395: Human Microbiome Project 16S rRNA Clinical Production Phase I, and SRP002012: Human Microbiome Project 454 Clinical Production Pilot. This dataset corresponds to over 5,700 samples and over 10,000 sequence preps. 16S variable region 3-5 (V35) was sequenced for the entire set of samples, and variable regions 1-3 (V13) and 6-9 (V69) for a subset of samples.

A 16S data processing pipeline was implemented with the mothur software package in two variants: a high stringency and a low stringency approach. The high stringency approach applies more aggressive sequence error reduction and is tailored towards Operational Taxonomic Unit (OTU) construction, while the low stringency approach preserves longer read lengths and is tailored towards taxonomic classification. The mothur output from both approaches is available here for all three 16S variable regions analyzed. Descriptions of the file types can be found in the file format readme available for each of the two approaches. We also provide the reference alignments and training sets required to replicate these processes. See the mothur SOP below and Schloss, Gevers and Westcott (2011) for more information.
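
The download tables on this page list an MD5 checksum for each file. As a minimal sketch of how a downloaded file could be checked against its listed checksum, the Python below uses only the standard library. The function names `md5_of_file` and `verify_download` are illustrative choices, not part of mothur or of any HMP tooling, and the file name and checksum in the usage note are placeholders.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that large downloads do not need
    to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 matches the checksum listed for it.

    The comparison ignores surrounding whitespace and letter case in
    the expected value, since checksums are often copied by hand.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

A call would look like `verify_download("downloaded_file", "<checksum from the table>")`, with both arguments replaced by the real file path and the value from the MD5 column. A `False` result suggests an incomplete or corrupted download.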

If you're interested in joint analysis of 16S and shotgun metagenomic datasets from the HMP, pairing up data from the same microbiome samples can initially seem tricky. The HMP Sample Flow Schematic indicates how these sample IDs are related experimentally, and provides tables joining 16S dataset "SN" and "PSN" identifiers with metagenomic dataset "SRS" identifiers.
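
Once such a joining table is in hand, pairing samples is a simple lookup. The sketch below assumes the table has been saved as tab-separated text with columns named `SN`, `PSN` and `SRS`. That layout, the helper names, and every identifier in the example are hypothetical and made up for illustration, so check them against the actual tables before use.

```python
import csv
import io

# Hypothetical excerpt of a joining table. The column names and all IDs
# below are invented for illustration; they are not real HMP identifiers.
MAPPING_TSV = """SN\tPSN\tSRS
SN0001\tPSN0001\tSRS000001
SN0002\tPSN0002\tSRS000002
SN0003\tPSN0003\t
"""


def load_psn_to_srs(handle):
    """Map each PSN to its SRS ID, skipping rows with no SRS value."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["PSN"]: row["SRS"] for row in reader if row["SRS"]}


def pair_samples(psn_ids, psn_to_srs):
    """Split 16S PSN IDs into those with and without a metagenomic match."""
    paired = {psn: psn_to_srs[psn] for psn in psn_ids if psn in psn_to_srs}
    unpaired = [psn for psn in psn_ids if psn not in psn_to_srs]
    return paired, unpaired


psn_to_srs = load_psn_to_srs(io.StringIO(MAPPING_TSV))
paired, unpaired = pair_samples(["PSN0001", "PSN0003", "PSN0099"], psn_to_srs)
print(paired)    # 16S samples that also have shotgun metagenomic data
print(unpaired)  # 16S samples with no SRS identifier in the table
```

Keeping the unpaired list explicit makes it easy to see how many 16S samples drop out of a joint analysis instead of losing them silently in the join.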

Reference Files
File | Download | Size | MD5 Checksum
Mothur Output Files
File | header | V13 Download | V13 Size | V13 MD5 | V35 Download | V35 Size | V35 MD5 | V69 Download | V69 Size | V69 MD5