American Gut File Sets
----------------------

The American Gut data has been packaged into directories for easy access.

Datasets
========

American Gut data was primarily divided by sequence trim length, bodysite, and rarefaction depth. Trimmed sequences were used to facilitate meta-analysis with other datasets. Participant results are returned at a lower rarefaction depth than is used in analysis. We've chosen to organize the data by bodysite, then trim length, and then rarefaction depth.

For each bodysite and trim length, an unrarefied mapping file and biom table are provided.

Data is additionally partitioned into all the samples from all participants at a particular bodysite, and a single sample per individual at each bodysite. The single sample was selected at random from all of an individual's samples that met the rarefaction criteria. This sample is used across all datasets. If no sample met the rarefaction criteria, the rarefaction level is stepped down and the procedure is repeated. This way, the sample selected for each individual is consistent across all the datasets while maximizing the number of samples for analysis. A list of the barcodes in each single-sample dataset is provided for each bodysite (e.g. `single_ids_1k.txt`).

For the fecal samples, we additionally filtered the data to include samples from individuals in a healthy subset of adults. To be included in this group, a participant had to report an age between 20 and 69, a BMI between 18.5 and 30, and no history of inflammatory bowel disease, diabetes, or antibiotic use in the past year. A single sample per individual is provided in each subset.

Data Dictionary
===============

A data dictionary describing all the base columns in the mapping file is provided as `data_dictionary.csv` in the parent partition directory.

Files
=====

Within each dataset directory, the following files are provided:

Metadata file
+++++++++++++

The mapping file downloaded from EBI.
Alpha diversity (PD whole tree, Shannon, Chao1, and observed OTUs) for the rarefaction depth and every lower depth has been added.

OTU table
+++++++++

A rarefied biom table.

Distance Matrices
+++++++++++++++++

The weighted and unweighted UniFrac distance matrices for the set of samples.

Analysis Directory Structure
============================

::

    fecal/
        single_ids_1k.txt
        single_ids_10k.txt
        single_ids_unrarefied.txt
        100nt/
            ag_fecal.biom
            ag_fecal.txt
            all_participants/
                all_samples/
                    1k/
                        ag_1k_fecal.biom
                        ag_1k_fecal.txt
                        unweighted_unifrac_ag_1k_fecal.txt
                        weighted_unifrac_ag_1k_fecal.txt
                    10k/
                        ag_10k_fecal.biom
                        ag_10k_fecal.txt
                        unweighted_unifrac_ag_10k_fecal.txt
                        weighted_unifrac_ag_10k_fecal.txt
                one_sample/
                    1k/
                        ag_1k_fecal.biom
                        ag_1k_fecal.txt
                        unweighted_unifrac_ag_1k_fecal.txt
                        weighted_unifrac_ag_1k_fecal.txt
                    10k/
                        ag_10k_fecal.biom
                        ag_10k_fecal.txt
                        unweighted_unifrac_ag_10k_fecal.txt
                        weighted_unifrac_ag_10k_fecal.txt
            sub_participants/
                all_samples/
                    1k/
                        ag_1k_fecal.biom
                        ag_1k_fecal.txt
                        unweighted_unifrac_ag_1k_fecal.txt
                        weighted_unifrac_ag_1k_fecal.txt
                    10k/
                        ag_10k_fecal.biom
                        ag_10k_fecal.txt
                        unweighted_unifrac_ag_10k_fecal.txt
                        weighted_unifrac_ag_10k_fecal.txt
                one_sample/
                    1k/
                        ag_1k_fecal.biom
                        ag_1k_fecal.txt
                        unweighted_unifrac_ag_1k_fecal.txt
                        weighted_unifrac_ag_1k_fecal.txt
                    10k/
                        ag_10k_fecal.biom
                        ag_10k_fecal.txt
                        unweighted_unifrac_ag_10k_fecal.txt
                        weighted_unifrac_ag_10k_fecal.txt
        notrim/
            100nt/
            ag_fecal.biom
            ag_fecal.txt
            all_participants/
                all_samples/
                    1k/
                        ag_1k_fecal.biom
                        ag_1k_fecal.txt
                        unweighted_unifrac_ag_1k_fecal.txt
                        weighted_unifrac_ag_1k_fecal.txt
                    10k/
                        ag_10k_fecal.biom
                        ag_10k_fecal.txt
                        unweighted_unifrac_ag_10k_fecal.txt
                        weighted_unifrac_ag_10k_fecal.txt
                one_sample/
                    1k/
                        ag_1k_fecal.biom
                        ag_1k_fecal.txt
                        unweighted_unifrac_ag_1k_fecal.txt
                        weighted_unifrac_ag_1k_fecal.txt
                    10k/
                        ag_10k_fecal.biom
                        ag_10k_fecal.txt
                        unweighted_unifrac_ag_10k_fecal.txt
                        weighted_unifrac_ag_10k_fecal.txt
            sub_participants/
                all_samples/
                    1k/
                        ag_1k_fecal.biom
                        ag_1k_fecal.txt
                        unweighted_unifrac_ag_1k_fecal.txt
                        weighted_unifrac_ag_1k_fecal.txt
                    10k/
                        ag_10k_fecal.biom
                        ag_10k_fecal.txt
                        unweighted_unifrac_ag_10k_fecal.txt
                        weighted_unifrac_ag_10k_fecal.txt
                one_sample/
                    1k/
                        ag_1k_fecal.biom
                        ag_1k_fecal.txt
                        unweighted_unifrac_ag_1k_fecal.txt
                        weighted_unifrac_ag_1k_fecal.txt
                    10k/
                        ag_10k_fecal.biom
                        ag_10k_fecal.txt
                        unweighted_unifrac_ag_10k_fecal.txt
                        weighted_unifrac_ag_10k_fecal.txt
    oral/
        single_ids_1k.txt
        single_ids_10k.txt
        single_ids_unrarefied.txt
        100nt/
            ag_oral.biom
            ag_oral.txt
            all_participants/
                all_samples/
                    1k/
                        ag_1k_oral.biom
                        ag_1k_oral.txt
                        unweighted_unifrac_ag_1k_oral.txt
                        weighted_unifrac_ag_1k_oral.txt
                    10k/
                        ag_10k_oral.biom
                        ag_10k_oral.txt
                        unweighted_unifrac_ag_10k_oral.txt
                        weighted_unifrac_ag_10k_oral.txt
                one_sample/
                    1k/
                        ag_1k_oral.biom
                        ag_1k_oral.txt
                        unweighted_unifrac_ag_1k_oral.txt
                        weighted_unifrac_ag_1k_oral.txt
                    10k/
                        ag_10k_oral.biom
                        ag_10k_oral.txt
                        unweighted_unifrac_ag_10k_oral.txt
                        weighted_unifrac_ag_10k_oral.txt
        notrim/
            100nt/
            ag_oral.biom
            ag_oral.txt
            all_participants/
                all_samples/
                    1k/
                        ag_1k_oral.biom
                        ag_1k_oral.txt
                        unweighted_unifrac_ag_1k_oral.txt
                        weighted_unifrac_ag_1k_oral.txt
                    10k/
                        ag_10k_oral.biom
                        ag_10k_oral.txt
                        unweighted_unifrac_ag_10k_oral.txt
                        weighted_unifrac_ag_10k_oral.txt
                one_sample/
                    1k/
                        ag_1k_oral.biom
                        ag_1k_oral.txt
                        unweighted_unifrac_ag_1k_oral.txt
                        weighted_unifrac_ag_1k_oral.txt
                    10k/
                        ag_10k_oral.biom
                        ag_10k_oral.txt
                        unweighted_unifrac_ag_10k_oral.txt
                        weighted_unifrac_ag_10k_oral.txt
    skin/
        single_ids_1k.txt
        single_ids_10k.txt
        single_ids_unrarefied.txt
        100nt/
            ag_skin.biom
            ag_skin.txt
            all_participants/
                all_samples/
                    1k/
                        ag_1k_skin.biom
                        ag_1k_skin.txt
                        unweighted_unifrac_ag_1k_skin.txt
                        weighted_unifrac_ag_1k_skin.txt
                    10k/
                        ag_10k_skin.biom
                        ag_10k_skin.txt
                        unweighted_unifrac_ag_10k_skin.txt
                        weighted_unifrac_ag_10k_skin.txt
                one_sample/
                    1k/
                        ag_1k_skin.biom
                        ag_1k_skin.txt
                        unweighted_unifrac_ag_1k_skin.txt
                        weighted_unifrac_ag_1k_skin.txt
                    10k/
                        ag_10k_skin.biom
                        ag_10k_skin.txt
                        unweighted_unifrac_ag_10k_skin.txt
                        weighted_unifrac_ag_10k_skin.txt
        notrim/
            100nt/
            ag_skin.biom
            ag_skin.txt
            all_participants/
                all_samples/
                    1k/
                        ag_1k_skin.biom
                        ag_1k_skin.txt
                        unweighted_unifrac_ag_1k_skin.txt
                        weighted_unifrac_ag_1k_skin.txt
                    10k/
                        ag_10k_skin.biom
                        ag_10k_skin.txt
                        unweighted_unifrac_ag_10k_skin.txt
                        weighted_unifrac_ag_10k_skin.txt
                one_sample/
                    1k/
                        ag_1k_skin.biom
                        ag_1k_skin.txt
                        unweighted_unifrac_ag_1k_skin.txt
                        weighted_unifrac_ag_1k_skin.txt
                    10k/
                        ag_10k_skin.biom
                        ag_10k_skin.txt
                        unweighted_unifrac_ag_10k_skin.txt
                        weighted_unifrac_ag_10k_skin.txt
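
The file names in the listing above follow a regular pattern, so the relative paths for one dataset can be built programmatically. The following is a minimal sketch, not part of the American Gut tooling: the function name `dataset_files` and the dictionary keys are hypothetical, and the nesting order (bodysite, trim length, participant set, sample set, rarefaction depth) is an assumption inferred from the listing and from the description in the Datasets section. It also assumes that `sub_participants` exists only for fecal samples, as the listing suggests.

```python
from pathlib import PurePosixPath

# Values taken from the directory listing in this document.
BODYSITES = ("fecal", "oral", "skin")
TRIMS = ("100nt", "notrim")
PARTICIPANT_SETS = ("all_participants", "sub_participants")
SAMPLE_SETS = ("all_samples", "one_sample")
DEPTHS = ("1k", "10k")


def dataset_files(bodysite, trim, participants, samples, depth):
    """Return relative paths for one rarefied dataset (hypothetical helper).

    Assumes the nesting bodysite/trim/participants/samples/depth.
    """
    checks = (
        (bodysite, BODYSITES, "bodysite"),
        (trim, TRIMS, "trim"),
        (participants, PARTICIPANT_SETS, "participants"),
        (samples, SAMPLE_SETS, "samples"),
        (depth, DEPTHS, "depth"),
    )
    for value, allowed, name in checks:
        if value not in allowed:
            raise ValueError(f"{name} must be one of {allowed}, got {value!r}")
    # The healthy-adult subset is listed for fecal samples only.
    if participants == "sub_participants" and bodysite != "fecal":
        raise ValueError("sub_participants is only listed for fecal samples")

    base = PurePosixPath(bodysite) / trim / participants / samples / depth
    stem = f"ag_{depth}_{bodysite}"
    return {
        "otu_table": base / f"{stem}.biom",
        "metadata": base / f"{stem}.txt",
        "unweighted_unifrac": base / f"unweighted_unifrac_{stem}.txt",
        "weighted_unifrac": base / f"weighted_unifrac_{stem}.txt",
    }


if __name__ == "__main__":
    files = dataset_files("fecal", "100nt", "sub_participants", "one_sample", "10k")
    for kind, path in files.items():
        print(f"{kind}: {path}")
```

Joining each returned path onto the local root of the download gives the full location of the file. If the actual layout on disk differs from the assumed nesting, only the line that builds `base` needs to change.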