#SampleID	BarcodeSequence	LinkerPrimerSequence	center_name	center_project_name	emp_status	experiment_design_description	key_seq	library_construction_protocol	linker	platform	region	run_center	run_date	run_prefix	samp_size	sample_center	sequencing_meth	study_center	target_gene	target_subfragment	age	age_unit	altitude	anonymized_name	assigned_from_geo	body_habitat	body_product	body_site	collection_timestamp	country	depth	dna_extracted	elevation	env_biome	env_feature	env_matter	has_physical_specimen	host_subject_id	host_taxid	latitude	longitude	physical_specimen_remaining	project_name	required_sample_info_status	sample_type	sex	taxon_id	title	Description
232.F10Space217	ATCGCTCGAGGA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	F10Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	F1	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	female	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.F11Space217	ATCTACTACACG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	F11Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	F1	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	female	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.F12Space217	ATCTGGTGCTAT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	F12Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	F1	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	unknown	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.L1Space217	ATGCAGCTCAGT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	L1Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	L1	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	unknown	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.L3Space217	ATGCGTAGTGCG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	L3Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	L3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	unknown	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.M10Space217	ATCGCGGACGAT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M10Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M1	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	male	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.M11Space217	ATCGTACAACTC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M11Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M1	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	male	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.M2Akey217	ACATGATCGTTC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Akey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Akey	male	408169	Forensic_identification_using_skin_bacterial_communities	Akey
232.M2Bkey217	ACGCGATACTGG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Bkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Bkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Bkey
232.M2Ckey217	ACGATGCGACCA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Ckey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ckey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ckey
232.M2Dkey217	ACATTCAGCGCA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Dkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Dkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Dkey
232.M2Ekey217	ACACTGTTCATG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Ekey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ekey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ekey
232.M2Enter217	ACGGTGAGTGTC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Enter217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ente	male	408169	Forensic_identification_using_skin_bacterial_communities	Ente
232.M2Fkey217	ACCACATACATC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Fkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Fkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Fkey
232.M2Gkey217	ACCAGACGATGC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Gkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Gkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Gkey
232.M2Hkey217	ACCAGCGACTAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Hkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Hkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Hkey
232.M2Ikey217	ACAGTGCTTCAT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Ikey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ikey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ikey
232.M2Indl217	AACTCGTCGATG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	36	years	0	M2Indl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M2	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M2Indr217	AATCGTGACTCG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	36	years	0	M2Indr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M2	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M2Jkey217	ACCGCAGAGTCA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Jkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Jkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Jkey
232.M2Kkey217	ACCTCGATCAGA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Kkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Kkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Kkey
232.M2Lkey217	ACCTGTCTCTCT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Lkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Lkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Lkey
232.M2Lsft217	ACGGATCGTCAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Lsft217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Left_shift	male	408169	Forensic_identification_using_skin_bacterial_communities	Left_shift
232.M2Midl217	AACTGTGCGTAC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	36	years	0	M2Midl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M2	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M2Midr217	ACACACTATGGC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	36	years	0	M2Midr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M2	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M2Mkey217	ACGCTATCTGGA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Mkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Mkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Mkey
232.M2Nkey217	ACGCGCAGATAC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Nkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Nkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Nkey
232.M2Okey217	ACAGTTGCGCGA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Okey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Okey	male	408169	Forensic_identification_using_skin_bacterial_communities	Okey
232.M2Pinl217	AAGCTGCAGTCG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	36	years	0	M2Pinl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M2	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M2Pinr217	ACACGAGCCACA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	36	years	0	M2Pinr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M2	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M2Pkey217	ACATCACTTAGC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Pkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Pkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Pkey
232.M2Qkey217	ACACGGTGTCTA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Qkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Qkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Qkey
232.M2Rinl217	AAGAGATGTCGA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	36	years	0	M2Rinl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M2	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M2Rinr217	ACACATGTCTAC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	36	years	0	M2Rinr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M2	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M2Rkey217	ACAGACCACTCA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Rkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Rkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Rkey
232.M2Rsft217	ACGCTCATGGAT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Rsft217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Right_shift	male	408169	Forensic_identification_using_skin_bacterial_communities	Right_shift
232.M2Skey217	ACATGTCACGTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Skey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Skey	male	408169	Forensic_identification_using_skin_bacterial_communities	Skey
232.M2Space217	ACGTACTCAGTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	male	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.M2Thml217	AACGCACGCTAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	36	years	0	M2Thml217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M2	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M2Thmr217	AATCAGTCTCGT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	36	years	0	M2Thmr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M2	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M2Tkey217	ACAGAGTCGGCT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Tkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Tkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Tkey
232.M2Ukey217	ACAGCTAGCTTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Ukey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ukey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ukey
232.M2Vkey217	ACGCAACTGCTA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Vkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Vkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Vkey
232.M2Wkey217	ACACTAGATCCG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Wkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Wkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Wkey
232.M2Xkey217	ACGAGTGCTATC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Xkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Xkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Xkey
232.M2Ykey217	ACAGCAGTGGTC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Ykey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ykey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ykey
232.M2Zkey217	ACGACGTCTTAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M2Zkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Zkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Zkey
232.M3Akey217	AGTGTTCGATCG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Akey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Akey	male	408169	Forensic_identification_using_skin_bacterial_communities	Akey
232.M3Bkey217	ATATGCCAGTGC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Bkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Bkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Bkey
232.M3Ckey217	ATAGGCGATCTC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Ckey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ckey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ckey
232.M3Ekey217	AGTCACATCACT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Ekey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ekey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ekey
232.M3Gkey217	ATAATCTCGTCG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Gkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Gkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Gkey
232.M3Hkey217	ATACACGTGGCG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Hkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Hkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Hkey
232.M3Indl217	AGCTATCCACGA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	33	years	0	M3Indl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M3	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M3Indr217	AGGACGCACTGT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	33	years	0	M3Indr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M3	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M3Jkey217	ATACAGAGCTCC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Jkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Jkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Jkey
232.M3Kkey217	ATACGTCTTCGA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Kkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Kkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Kkey
232.M3Lkey217	ATACTATTGCGC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Lkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Lkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Lkey
232.M3Lsft217	ATCAGGCGTGTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Lsft217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Left_Shift	male	408169	Forensic_identification_using_skin_bacterial_communities	Left_Shift
232.M3Midl217	AGCTCCATACAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	33	years	0	M3Midl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M3	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M3Midr217	AGGCTACACGAC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	33	years	0	M3Midr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M3	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M3Mkey217	ATCACTAGTCAC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Mkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Mkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Mkey
232.M3Nkey217	ATCACGTAGCGG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Nkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Nkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Nkey
232.M3Pinl217	AGCTGACTAGTC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	33	years	0	M3Pinl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M3	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M3Pinr217	AGTACGCTCGAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	33	years	0	M3Pinr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M3	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M3Pkey217	AGTGTCACGGTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Pkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Pkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Pkey
232.M3Qkey217	AGTACTGCAGGC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Qkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Qkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Qkey
232.M3Rinl217	AGCTCTCAGAGG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	33	years	0	M3Rinl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M3	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M3Rinr217	AGGTGTGATCGC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	33	years	0	M3Rinr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M3	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M3Rkey217	AGTCCATAGCTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Rkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Rkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Rkey
232.M3Rsft217	ATCCGATCACAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Rsft217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Right_shift	male	408169	Forensic_identification_using_skin_bacterial_communities	Right_shift
232.M3Space217	ATCGATCTGTGG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	male	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.M3Thml217	AGCGTAGGTCGT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	33	years	0	M3Thml217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M3	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M3Thmr217	AGCTTGACAGCT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	33	years	0	M3Thmr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M3	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M3Tkey217	AGTCTACTCTGA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Tkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Tkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Tkey
232.M3Vkey217	ATATCGCTACTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Vkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Vkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Vkey
232.M3Wkey217	AGTAGTATCCTC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Wkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Wkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Wkey
232.M3Xkey217	ATAGCTCCATAC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Xkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Xkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Xkey
232.M3Ykey217	AGTCTCGCATAT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Ykey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ykey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ykey
232.M3Zkey217	ATACTCACTCAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M3Zkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Zkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Zkey
232.M9Akey217	AGACCGTCAGAC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Akey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Akey	male	408169	Forensic_identification_using_skin_bacterial_communities	Akey
232.M9Bkey217	AGCAGCACTTGT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Bkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Bkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Bkey
232.M9Ckey217	AGCACACCTACA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Ckey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ckey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ckey
232.M9Dkey217	AGACTGCGTACT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Dkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Dkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Dkey
232.M9Ekey217	ACTCTTCTAGAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Ekey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ekey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ekey
232.M9Enter217	AGCGAGCTATCT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Enter217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ente	male	408169	Forensic_identification_using_skin_bacterial_communities	Ente
232.M9Fkey217	AGAGAGCAAGTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Fkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Fkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Fkey
232.M9Gkey217	AGAGCAAGAGCA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Gkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Gkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Gkey
232.M9Hkey217	AGAGTAGCTAAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Hkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Hkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Hkey
232.M9Indl217	ACGTGAGAGAAT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	25	years	0	M9Indl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M9	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M9Indr217	ACTAGCTCCATA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	25	years	0	M9Indr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M9	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M9Kkey217	AGATACACGCGC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Kkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Kkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Kkey
232.M9Midl217	ACGTGCCGTAGA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	25	years	0	M9Midl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M9	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M9Midr217	ACTATTGTCACG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	25	years	0	M9Midr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M9	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M9Mkey217	AGCATATGAGAG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Mkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Mkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Mkey
232.M9Nkey217	AGCAGTCGCGAT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Nkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Nkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Nkey
232.M9Okey217	ACTTGTAGCAGC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Okey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Okey	male	408169	Forensic_identification_using_skin_bacterial_communities	Okey
232.M9Pinl217	ACTACAGCCTAT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	25	years	0	M9Pinl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M9	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M9Pinr217	ACTCAGATACTC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	25	years	0	M9Pinr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M9	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M9Pkey217	AGAACACGTCTC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Pkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Pkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Pkey
232.M9Qkey217	ACTCGATTCGAT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Qkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Qkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Qkey
232.M9Rinl217	ACGTTAGCACAC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	25	years	0	M9Rinl217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M9	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M9Rinr217	ACTCACGGTATG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	25	years	0	M9Rinr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M9	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M9Skey217	AGACGTGCACTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Skey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Skey	male	408169	Forensic_identification_using_skin_bacterial_communities	Skey
232.M9Space217	AGCGCTGATGTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	male	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.M9Thml217	ACGTCTGTAGCA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	25	years	0	M9Thml217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M9	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M9Thmr217	ACTACGTGTGGT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	25	years	0	M9Thmr217	False	UBERON:skin	UBERON:sebum	UBERON:skin	7/15/08	GAZ:United States of America	0	True	1624	ENVO:human-associated habitat	ENVO:human-associated habitat	ENVO:human-associated habitat	True	M9	9606	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	finger_tip	male	539655	Forensic_identification_using_skin_bacterial_communities	finger_tip
232.M9Vkey217	AGCACGAGCCTA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Vkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Vkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Vkey
232.M9Wkey217	ACTCGCACAGGA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Wkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Wkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Wkey
232.M9Xkey217	AGATGTTCTGCT	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Xkey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Xkey	male	408169	Forensic_identification_using_skin_bacterial_communities	Xkey
232.M9Ykey217	ACTGTACGCGTA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	M9Ykey217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	M9	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Ykey	male	408169	Forensic_identification_using_skin_bacterial_communities	Ykey
232.R1Space217	ATCTCTGGCATA	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	R1Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	R1	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	unknown	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.U1Space217	ATGACCATCGTG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	U1Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	U1	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	unknown	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.U2Space217	ATGACTCATTCG	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	U2Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	U2	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	unknown	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
232.U3Space217	ATGAGACTCCAC	CATGCTGCCTCCCGTAGGAGT	CCME	Forensic_identification_using_skin_bacterial_communities	EMP	Forensic_identification_using_skin_bacterial_communities	TCAG	16S_rRNA_gene_sequences_were_processed_according_to_the_methods_described_in_our_previous_publications_(Fierer_et_al.,_2008;_Hamady_et_al.,_2008)._Briefly,_sequences_<200_or_>300gnt_or_with_average_quality_scores_of_<25_were_removed_from_the_dataset,_as_were_those_with_uncorrectable_barcodes,_ambiguous_bases,_or_if_the_bacterial_16S_rRNA_gene-specific_primer_was_absent._Sequences_were_then_assigned_to_the_specific_subsamples_based_on_their_unique_12nt_barcode_and_then_grouped_into_phylotypes_at_the_97%_level_of_sequence_identity_using_cd-hit_(Li_&_Godzik,_2006)_with_a_minimum_coverage_of_97%._We_chose_to_group_the_phylotypes_at_97%_identity_because_this_matches_the_limits_of_resolution_of_pyrosequencing_(Kunin_et_al.,_2010)_and_because_the_branch_length_so_omitted_contributes_little_to_the_tree_and_therefore_to_phylogenetic_estimates_of___diversity_(Hamady_et_al.,_2009)._A_representative_for_each_phylotype_was_chosen_by_selecting_the_most_abundant_sequence_in_the_phylotype,_with_ties_being_broken_by_choosing_the_longest_sequence._A_phylogenetic_tree_of_the_representative_sequences_was_constructed_using_the_Kimura_2-parameter_model_in_Fast_Tree_(Price_et_al.,_2009)_after_sequences_were_aligned_with_NAST_(minimum_150nt_at_75%_minimum_identity)_(DeSantis_et_al.,_2006a)_against_the_GreenGenes_database_(DeSantis_et_al.,_2006b)._Hypervariable_regions_were_screened_out_of_the_alignment_using_PH_Lane_mask_(http://greengenes.lbl.gov/)._Differences_in_the_community_composition_for_each_pair_of_samples_were_determined_from_the_phylogenetic_tree_using_the_weighted_and_unweighted_UniFrac_algorithms_(Lozupone_&_Knight,_2005;_Lozupone_et_al.,_2006)._UniFrac_is_a_tree-based_metric_that_measures_the_distance_between_two_communities_as_the_fraction_of_branch_length_in_a_phylogenetic_tree_that_is_unique_to_one_of_the_communities_(as_opposed_to_being_shared_by_both)._This_method_of_community_comparison_accounts_for_the_relative_similarities_and_differences_among_phylotypes_(or_higher_taxa)_rather_than_treating_all_taxa_at_a_given_level_of_divergence_as_equal_(Lozupone_&_Knight,_2008)._Although_UniFrac_depends_on_a_phylogenetic_tree,_it_is_relatively_robust_to_differences_in_the_tree_reconstruction_method_or_to_the_approximation_of_using_phylotypes_to_represent_groups_of_very_similar_sequences_(Hamady_et_al.,_2009).	CA	FLX	0	CCME	8/14/08	FFCKVMW	1, swab	CCME	pyrosequencing	CCME	16S rRNA	V2	unknown	years	0	U3Space217	False	unknown	unknown	unknown	7/15/08	GAZ:United States of America	0	True	1624	ENVO:surface	ENVO:surface	ENVO:surface	True	U3	36244	40.0083	-105.2705	False	fierer_forensic_keyboard	completed	Space_bar	unknown	408169	Forensic_identification_using_skin_bacterial_communities	Space_bar
