Skip to main content

Advertisement

Analysis of the salivary microbiome using culture-independent techniques

Abstract

Background

The salivary microbiota is a potential diagnostic indicator of several diseases. Culture-independent techniques are required to study the salivary microbial community since many of its members have not been cultivated.

Methods

We explored the bacterial community composition in the saliva sample using metagenomic whole genome shotgun (WGS) sequencing, the extraction of 16S rRNA gene fragments from metagenomic sequences (16S-WGS) and high-throughput sequencing of PCR-amplified bacterial 16S rDNA gene (16S-HTS) regions V1 and V3.

Results

The hierarchical clustering of data based on the relative abundance of bacterial genera revealed that distances between 16S-HTS datasets for V1 and V3 regions were greater than those obtained for the same V region with different numbers of PCR cycles. Datasets generated by 16S-HTS and 16S-WGS were even more distant. Finally, comparison of WGS and 16S-based datasets revealed the highest dissimilarity.

The analysis of the 16S-HTS, WGS and 16S-WGS datasets revealed 206, 56 and 39 bacterial genera, respectively, 124 of which have not been previously identified in salivary microbiomes. A large fraction of DNA extracted from saliva corresponded to human DNA. Based on sequence similarity search against completely sequenced genomes, bacterial and viral sequences represented 0.73% and 0.0036% of the salivary metagenome, respectively. Several sequence reads were identified as parts of the human herpesvirus 7.

Conclusions

Analysis of the salivary metagenome may have implications in diagnostics e.g. in detection of microorganisms and viruses without designing specific tests for each pathogen.

Background

The microbiota in the mouth has a significant impact on both the oral and general health. Bacterial species associated with periodontal health and those that are more prevalent in periodontal disease have been identified [1]. The salivary microbiota is a potential diagnostic indicator of several diseases. For instance, a caries-free oral status in children is associated with a significant shift in the relative abundance of Porphyromonas catoniae and Neisseria flavescens in saliva [2]. Increased salivary counts of Capnocytophaga gingivalis, Prevotella melaninogenica and Streptococcus mitis are associated with oral cancer [3]. The salivary level of the bacterium Selenomonas noxia correlates with obesity in women [4].

The study of the oral microbiota as well as its salivary component requires culture-independent techniques, since about one third of 700 bacterial species identified in the human oral cavity have not been cultivated [5]. These may be based on PCR amplification and high-throughput sequencing of the bacterial 16S rRNA genes (16S-HTS) or the metagenomic whole genome shotgun (WGS) sequencing. The latter approach may include either the analysis of the totality of generated DNA fragments or of the 16S rRNA gene fragments retrieved from the metagenome (16S-WGS) [6]. Both 16S-WGS and 16S-HTS approaches present limitations and advantages over each other [6].

Here we explored the microbial community composition in the saliva sample using WGS, 16S-WGS and 16S-HTS. In addition, to assess putative biases due to PCR amplification, we compared taxonomic composition of 16S-HTS datasets obtained after different number of PCR cycles.

Methods

Sampling

The study was conducted according to the current version of Declaration of Helsinki and approved by the Ethics Committee of HUG (09-078). Unstimulated saliva was obtained with informed consent from a 32-year male smoker without obvious signs of oral disease. The sample was collected by spitting in a sterile plastic 50-mL tube at 10:30 a.m., 1.5 hours after eating. Six hundred μL saliva was mixed with the same volume of 2x lysis buffer [Tris 20 mM, EDTA 2 mM (pH 8), Tween 1%] and Proteinase K (Eurobio) 200 μg/mL. After a 2.5 hour incubation at 55°C, proteinase K was inactivated by a 10-min heating at 95°C. The saliva lysate was divided in six 200-μL aliquots to which RnaseA (Roche) 40 μg/mL was added. Samples were incubated for 5 min at room temperature. From that point, the DNeasy Blood & Tissue Kit (QIAGEN) was used following the manufacturer's Spin-Column Protocol for Purification of Total DNA from Animal Blood or Cells (DNeasy Blood & Tissue Handbook 07/2006). DNA was eluted using 110 μL of supplied AE Buffer, then the pooled eluate (metagenomic DNA) was concentrated to 80 ng/μL. Total DNA quantity was assessed using a NanoDrop ND-8000 spectrophotometer (NanoDrop Technologies).

PCR and sequencing

PCR amplification was carried out in a 50-μL PrimeStar HS Premix (Takara) containing 8 ng of purified DNA and 0.5 μM of each forward and reverse primer. The 16S rDNA V1-3 amplicons generated with primers 5'-GAGTTTGATCMTGGCTCAG (V1 forward) and 5'-CCGCGRCTGCTGGCAC (V3 reverse) corresponded to E. coli positions 28 to 514 after exclusion of primers sequences. The samples were run in four replicate PCRs for 20, 25 or 30 cycles using the following parameters: 98°C for 10 s, 60°C for 15 s, and 72°C for 1 min. The four replicate PCRs were then pooled.

Paired-end DNA libraries were prepared according to the manufacturer's (Illumina) instructions. Metagenomic DNA fragments of about 300 bp and 16S rDNA amplicons were barcoded using specific 6-base sequences. The libraries were sequenced from both ends for 100 cycles (excluding barcode sequences) on the Illumina Hi-Seq 2000 using TruSeq SBS v5 kit. A barcoded PhiX reference was spiked in the same channel to estimate the error rate.

Sequence filtering

Parameters of the initial quality filter were the following: (i) maximum one base below a quality of 5 in the first 70 bases; (ii) a minimum average quality of 10; (iii) no ambiguous base allowed. After filtering, the average Q30 was larger than 75% and the average PhiX error rate was 0.7%. Only pairs were retained in the filtered data, i.e. if one read was filtered out the paired read was removed. Each of the three 16S-HTS datasets (20, 25 and 30 PCR cycles) was reduced by randomly picking 1.2 million sequence read pairs. Then, in the second filtering step, we removed sequences containing incorrect PCR primer sequences or runs of ≥ 12 identical nucleotides. The WGS dataset was reduced to one million sequence pairs and was not subject to additional filtering steps. Sequences were deposited in MG-RAST under accession numbers 4477823.3, 4477824.3, 4477839.3, 4477840.3, 4478078.3, 4478079.3, 4478080.3, 4478370.3, 4478371.3, 4479520.3, 4479521.3, 4479522.3, 4479523.3 and 4479524.3.

Analysis

The 16S rDNA sequences were clustered to operational taxonomic units (OTUs) defined at 95% identity using CD-HIT [7]. The V1 and V3 sequences were assigned the taxonomic identity using the Ribosomal Database Project (RDP) Classifier [8] with a recommended 50% confidence cutoff. Taxonomic assignments of sequences from the WGS dataset were made using BLASTN [9] against NCBI prokaryotic, viral and fungal databases as well as against the human sequences from NCBI and EBI databases. The criteria used were a wordsize of 16, ≥ 94% identity, ≥ 90 overlap and e-value ≤10-30. The bacterial 16S rDNA sequences were extracted from the WGS dataset using CAMERA [10] and HMMER search option. They were then filtered using an e value ≤10-10 and assigned to genera using the RDP Classifier.

Group-average clustering of data was performed using a Bray-Curtis similarity matrix in PRIMER-E (Plymouth), based on square-root-transformed genera abundance.

Results and Discussion

Illumina sequencing

We explored the microbial community composition in the saliva sample from a male healthy adult using Illumina sequencing. The 100-base paired reads from the whole metagenome fragments as well as 81-base V1 and 84-base V3 reads of 16S rDNA amplicons were analyzed (Table 1). Data from the forward and reverse run for each pool of DNA fragments were analyzed separately since it has been reported that reverse Illumina reads are of lower quality than those from the forward run [11].

Table 1 Description of the 4 sequence datasets and 14 subsets

Taxa abundance as a function of the PCR cycle number in the 16S-HTS datasets

Taxa detection and the accuracy of 16S rDNA abundance measurement are affected by the number of PCR cycles used to amplify 16S rDNA from a bacterial community [1214]. To investigate how taxonomic composition of the same sample differed depending on the number of PCR cycles, we analyzed forward and reverse reads from V1 and V3 16S-HTS datasets obtained after 20, 25 and 30 cycles. The taxonomic assignment was performed using RDP Classifier (Additional file 1). The changes in proportions were relatively consistent for the different taxa within the same phylum (Figure 1): the relative abundance of sequence reads assigned to taxa belonging to the phyla Actinobacteria and Firmicutes generally decreased with more PCR cycles, whereas the proportion of sequences representing other phyla and their corresponding lower-level taxa generally increased. Instances in which the same direction of change (decrease or increase) occurred across all 8 subsets were found in 63 of the 109 taxa (Figure 1). Moreover, in 95% of cases where a > 25% change in taxa abundance was found in 30- vs 20- cycle-samples, the value obtained after 25 cycles was intermediate relatively to those obtained after 20 and 30 cycles. It seems likely that, for some taxa, more PCR cycles will further increase the bias. We found that the average Firmicutes to Bacteroidetes ratio, which may be an indicator of obesity in intestinal microbiota [15], was on average 3.6 (range 3.4-3.9), 2.3 (range 2.2-2.4) and 2.0 (range 1.9-2.2) in 20-, 25- and 30-cycle 16S-HTS datasets, respectively.

Figure 1
figure1

Heat map showing changes in taxa proportions as a function of PCR cycle number. The taxa shared by all (twelve) 16S-HTS subsets were analyzed. The relative abundances of taxa after 20 PCR cycles were used as baselines for comparisons. These values are represented according to the grey scale below the heat map. Changes (%) in the relative abundance of taxa after 25 or 30 cycles are represented by rectangles according to the color scale below the heat map. The corresponding values are given in the Additional files 1 and 3. Differences were significant (P < 0.01; chi-square test) unless marked by an asterisk.

There is a concern that short Illumina reads and sequence errors may compromise the quality of taxonomic assignments [1618]. To assess the accuracy of taxonomic assignments we extracted 81-base V1 and 84-base V3 sequences from the 16Sr RNA gene for 660 species from the Human Oral Microbiome Database (HOMD) [5] for which the taxonomic information was available at the genus level. These simulated Illumina reads were assigned taxonomy using the RDP Classifier with a recommended 50% bootstrap cutoff. The proportion of V1 and V3 sequences correctly assigned at the genus level reached 68% and 76%, respectively (Additional File 2). For both, V1 and V3 regions, the accuracy of taxonomic assignment at the phylum level was greater than 95%.

We clustered sequence reads generated by Illumina sequencing into OTUs, defined at ≥ 95% identity, which roughly corresponds to genus-level grouping [19] and may have the effect of absorbing some sequence errors [20]. Then, we compared the OTU content across datasets obtained after different number of PCR cycles using the phylum-level affiliation of representative OTUs derived from the RDP Classifier. The OTUs that met the criteria described in Additional file 3 were selected for comparisons. This approach confirmed the trend observed when taxonomy was assigned to each sequence read (see above); the relative abundance of the majority of OTUs from the phyla Actinobacteria and Firmicutes was decreased whereas the proportion of most OTUs from the phyla Bacteroidetes, Fusobacteria and Spirochaetes was enhanced by increasing the number of PCR cycles (Additional file 3).

Therefore, performing more PCR cycles, which may be required when little template DNA is available, may introduce amplification biases and increase the distance from samples for which less PCR cycles were performed.

Taxonomic assignment in the WGS datasets

In the WGS approach, the taxonomic assignments were inferred from BLASTN searches of individual sequence reads. Sequences were compared to human genomic sequence as well as to databases containing completely sequenced prokaryotic, viral and fungal genomes. Most of the BLASTN hits corresponded to human DNA, whereas bacterial and viral sequences represented 0.73% and 0.0036% respectively (Table 2). Forward reads performed better in terms of assignment yield to these three categories, reflecting a higher sequence quality in comparison to the reverse reads [11]. A total of 369 and 367 16S rRNA gene fragments were extracted from WGS forward and reverse datasets, respectively, using CAMERA.

Table 2 Number of BLATSN hits against human, bacterial and viral databases

Comparison of the 16S-HTS and WGS datasets

We performed hierarchical clustering of WGS and 16S datasets based on the relative abundance of bacterial genera. Before computing the similarities, we applied a square-root-transformation of the relative abundance data in order to equilibrate the impact of abundant and rare genera. The resulting dendrogram (Figure 2) shows two main clusters that correspond to WGS and 16S-based datasets. The latter is divided into 2 subclusters corresponding to 16S-HTS and 16S-WGS approach. The 16S-HTS subcluster further splits according to the V region sequenced. Finally, for a given V region, datasets generated by sequencing PCR products obtained after 20 amplification cycles were separated from their 25- and 30-cycle counterparts which clustered together. The highest variation between the forward and reverse datasets was found for the 16S-WGS approach, which may be due to a rather small number (< 400) of sequences analyzed in comparison to the two other approaches (Table 1) and the fact that the forward and reverse reads in the 16S-WGS dataset cover different, randomly distributed segments of 16S rRNA genes which may lead to variation in taxonomic assignments.

Figure 2
figure2

Comparison of WGS, 16S-WGS and 16S-HTS datasets to study bacterial community of the saliva sample. Group-average clustering of data was performed using a Bray-Curtis similarity matrix in PRIMER-E (Plymouth), based on square-root-transformed genera abundance. Only genera which occurred at a frequency of > 0.1% in at least one of 14 subsets were included in the analysis. V1 and V3 designate the sequenced hypervariable region of 16S rDNA. 20, 25 and 30 indicate the number of PCR cycles performed. F, forward reads; R, reverse reads.

In order to confirm that the pattern observed is reproducible across individuals, it would be necessary to analyze a larger number of salivary microbiomes using the same methodologies. As a first step towards addressing this issue, we extracted the 84-base V3 16S rDNA sequences from the 5 salivary microbiomes reported in a previous study [21]. The relevant data were then included in the construction of the similarity matrix as described above (not presented). We found that the average similarity between the V3 dataset determined in the current study and the five published microbiomes was comparable to the average similarity observed among these five microbiomes (Table 3). Relative to these values, our V3 16S-HTS datasets showed higher degrees of similarity when compared to our V1 16S-HTS and 16S-WGS datasets (Table 3) i.e. the interindividual differences may outweigh to some extent the methodological differences. Still, differences in the genera distribution inferred from 16S-WGS, 16S-HTS and WGS datasets (Table 3) are too great to allow for reliable comparisons of the results generated using different methodologies. However, as the reference microbial genome database will grow, the WGS and 16S-WGS methodologies will likely provide more closely related data.

Table 3 Similarity between the V3 16S and other datasets

We identified 206 bacterial genera using 16S-HTS, 108 of which have not been previously found in salivary microbiomes using culture-independent techniques [18, 2124]. This was also the case with 19 out of 56 genera determined by WGS, and 6 out of 39 genera identified by 16S-WGS approach. The majority of the new salivary genera (116/124) were found at a frequency < 0.1%, and only 8 occurred at a frequency between 0.1 and 0.74% (Additional files 1 and 4). This suggests that the most abundant bacterial genera in the saliva of healthy subjects have probably already been identified. However, the inventory and dynamics of low-abundance-genera, whose identification requires a deeper sample coverage, remain largely unknown.

Using WGS sequencing, which, in contrast to the 16S-HTS method applied in this study, does not specifically target bacteria, we did not detect archaea in the saliva sample. This is not surprising since the only archaeon identified so far in the human oral cavity i.e. Methanobrevibacter oralis, was found in dental plaques associated with pathological processes [25]. BLASTN similarity search against fungal genomes of the NCBI database did not yield any significant hits. The reasons for this may be: (i) an inefficient disruption of fungal cells by the enzymatic procedure used to release DNA from bacteria; (ii) the presence of fungi in saliva under the detection level; (iii) the absence of the relevant fungal genomes in the database. So far, sequences of only six fungal genera (Zygosaccharomyces, Penicillium, Gibberella, Saccharomyces, Aspergillus, Candida), present in the oral cavity of healthy individuals [26], are available in public databases.

Salivary viriome

Based on BLASTN comparisons to the NCBI virus database, we identified sequences possibly derived from three different eukaryotic viruses. The most abundant (Table 4) was human herpes virus 7 with 17 and 15 sequence reads in the forward and reverse WGS datasets, respectively. A sequence similar to porcine endogenous retrovirus and the virus of the green alga Chlorella were identified as well. Several sequences produced the best BLASTN hits to bacteriophages including Streptococcus phage SM1 and two enterobacterial phages, lambda and phiX174.

Table 4 Detection of viral sequences in the WGS dataset using BLASTN

Clinical applications

Metagenomics has the potential to serve as a viral and bacterial infection control strategy in clinical practice because it can discover known as well as new pathogens, and might soon replace many existing typing methods in diagnostics.

HTS of cDNA has already been successfully applied to the detection of new viral pathogens in human serum and liver as well as in the reconstruction of viral genomes [2729]. Similarly, WGS of a patient's feces samples detected the bacterial pathogen Campylobacter jejuni during but not after an acute diarrheal episode [30].

At least six double-stranded DNA human herpes viruses (HHV) i.e. Herpes simplex virus 1, Epstein-Barr virus, cytomegalovirus and human herpesviruses 6, 7 and 8 have been detected in saliva using sensitive PCR assays [31]. These viruses are shed in saliva asymptomatically which could facilitate their transmission. Most human adults are infected with HHVs but the prevalence of some HHV is significantly higher in HIV-seropositive persons. In subjects with recurrent oral Herpes simplex virus 1 infections, two other HHVs, HHV-6 and HHV-7 were simultaneously present with a frequency of over 93% [31]. HHvs possibly contribute to periodontitis which, in turn, facilitates virus shed into saliva [32]. Recently, using a metagenomic approach Willner et al. [33] identified Epstein-Barr virus in a pool of oropharyngeal swabs from 19 individuals.

In our study WGS sequencing applied on the salivary metagenome allowed identification of sequences showing the best similarity to the human herpesvirus 7 as well as to the putative periodontopathic bacteria Porphyromonas gingivalis, Treponema denticola and Aggregatibacter actinomycetemcomitans [34].

The exact role of bacteria and viruses in periodontitis and other oral diseases is not elucidated. It has been hypothesized that bacteria and viruses cooperate to provoke the disease [35]. The detection of periodontopathic agents is important because periodontitis has been associated with other health problems such as cardiovascular diseases, premature delivery, rheumatoid arthritis and cancer [35].

Although a large fraction of DNA extracted from saliva corresponds to human DNA, we estimate that, at a coverage consisting of a hundred million sequences, which is the current capacity per channel on the Illumina platform, hundreds of thousands of bacterial sequences and thousands of viral and phage sequences may be identified. Therefore, detection of viruses by metagenomic sequencing is possible even without including filtration and concentration steps, although these procedures are effective in enriching the metagenomic samples for viral DNA [33]. In addition, tens of thousands of 16S rDNA sequences, free of amplification anomalies, may be extracted from huge WGS datasets and used to assess taxonomic composition of bacterial communities.

Conclusions

Analysis of the salivary microbiome is not only of interest from a fundamental perspective, but may have implications in diagnostics e.g. in detection of viruses and microorganisms without including specific tests for each pathogen.

In our study, WGS sequencing compared to 16S-HTS generated a higher fraction of taxonomically unassigned non-human sequences because of the lack of homologs in sequence databases. Using relatively stringent BLASTN parameters about 19% and 35% of sequence reads remained taxonomically unassigned in the forward- and reverse-run WGS subsets, respectively. Nevertheless, the advantage of the WGS approach is that it allows assessment of not only bacterial but also viral (human viruses and phages) and possibly fungal and archaeal communities which undoubtedly play an important role in oral health or disease. In addition, an in-depth sequencing of a salivary metagenome may provide insights into gene functions and allow for reconstruction of the functional potential of a microbial population [36]. Functional assignments of sequences may be made for instance using CAMERA, CARMA3 [37] or MG-RAST [38], as it was recently shown for the supragingival dental plaque microbome [39]. The obtained sequences may be assigned to known functions and classified to major categories including, among others, virulence and resistance to antibiotics.

Abbreviations

HHV:

human herpes virus

HTS:

high throughput sequencing

OTU:

operational taxonomic unit

WGS:

whole genome shotgun.

References

  1. 1.

    Colombo AP, Boches SK, Cotton SL, Goodson JM, Kent R, Haffajee AD, Socransky SS, Hasturk H, Van Dyke TE, Dewhirst F, Paster BJ: Comparisons of subgingival microbial profiles of refractory periodontitis, severe periodontitis, and periodontal health using the human oral microbe identification microarray. J Periodontol. 2009, 80: 1421-1432. 10.1902/jop.2009.090185.

  2. 2.

    Crielaard W, Zaura E, Schuller AA, Huse SM, Montijn RC, Keijser BJ: Exploring the oral microbiota of children at various developmental stages of their dentition in the relation to their oral health. BMC Med Genomics. 2011, 4: 22-10.1186/1755-8794-4-22.

  3. 3.

    Mager DL, Haffajee AD, Devlin PM, Norris CM, Posner MR, Goodson JM: The salivary microbiota as a diagnostic indicator of oral cancer: a descriptive, non-randomized study of cancer-free and oral squamous cell carcinoma subjects. J Transl Med. 2005, 3: 27-10.1186/1479-5876-3-27.

  4. 4.

    Goodson JM, Groppo D, Halem S, Carpino E: Is obesity an oral bacterial disease?. J Dent Res. 2009, 88: 519-523. 10.1177/0022034509338353.

  5. 5.

    Chen T, Yu WH, Izard J, Baranova OV, Lakshmanan A, Dewhirst FE: The Human Oral Microbiome Database: a web accessible resource for investigating oral microbe taxonomic and genomic information. Database (Oxford). 2010, 2010: baq013-

  6. 6.

    Shah N, Tang H, Doak TG, Ye Y: Comparing bacterial communities inferred from 16S rRNA gene sequencing and shotgun metagenomics. Pac Symp Biocomput. 2011, 165-176.

  7. 7.

    Huang Y, Niu B, Gao Y, Fu L, Li W: CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics. 2010, 26: 680-682. 10.1093/bioinformatics/btq003.

  8. 8.

    Wang Q, Garrity GM, Tiedje JM, Cole JR: Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007, 73: 5261-5267. 10.1128/AEM.00062-07.

  9. 9.

    Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.

  10. 10.

    Seshadri R, Kravitz SA, Smarr L, Gilna P, Frazier M: CAMERA: a community resource for metagenomics. PLoS Biol. 2007, 5: e75-10.1371/journal.pbio.0050075.

  11. 11.

    Claesson MJ, O'Sullivan O, Wang Q, Nikkila J, Marchesi JR, Smidt H, de Vos WM, Ross RP, O'Toole PW: Comparative analysis of pyrosequencing and a phylogenetic microarray for exploring microbial community structures in the human distal intestine. PLoS One. 2009, 4: e6669-10.1371/journal.pone.0006669.

  12. 12.

    Paliy O, Foy BD: Mathematical modeling of 16S ribosomal DNA amplification reveals optimal conditions for the interrogation of complex microbial communities with phylogenetic microarrays. Bioinformatics. 2011, 27: 2134-2140. 10.1093/bioinformatics/btr326.

  13. 13.

    Sipos R, Szekely AJ, Palatinszky M, Revesz S, Marialigeti K, Nikolausz M: Effect of primer mismatch, annealing temperature and PCR cycle number on 16S rRNA gene-targetting bacterial community analysis. FEMS Microbiol Ecol. 2007, 60: 341-350. 10.1111/j.1574-6941.2007.00283.x.

  14. 14.

    Suzuki M, Rappe MS, Giovannoni SJ: Kinetic bias in estimates of coastal picoplankton community structure obtained by measurements of small-subunit rRNA gene PCR amplicon length heterogeneity. Appl Environ Microbiol. 1998, 64: 4522-4529.

  15. 15.

    Ley RE, Turnbaugh PJ, Klein S, Gordon JI: Microbial ecology: human gut microbes associated with obesity. Nature. 2006, 444: 1022-1023. 10.1038/4441022a.

  16. 16.

    Claesson MJ, Wang Q, O'Sullivan O, Greene-Diniz R, Cole JR, Ross RP, O'Toole PW: Comparison of two next-generation sequencing technologies for resolving highly complex microbiota composition using tandem variable 16S rRNA gene regions. Nucleic Acids Res. 2010, 38: e200-10.1093/nar/gkq873.

  17. 17.

    Degnan PH, Ochman H: Illumina-based analysis of microbial community diversity. ISME J. 2012, 6: 183-194. 10.1038/ismej.2011.74.

  18. 18.

    Lazarevic V, Whiteson K, Huse S, Hernandez D, Farinelli L, Østerås M, Schrenzel J, François P: Metagenomic study of the oral microbiota by Illumina high-throughput sequencing. J Microbiol Methods. 2009, 79: 266-271. 10.1016/j.mimet.2009.09.012.

  19. 19.

    Schloss PD, Handelsman J: Introducing DOTUR, a computer program for defining operational taxonomic units and estimating species richness. Appl Environ Microbiol. 2005, 71: 1501-1506. 10.1128/AEM.71.3.1501-1506.2005.

  20. 20.

    Kunin V, Engelbrektson A, Ochman H, Hugenholtz P: Wrinkles in the rare biosphere: pyrosequencing errors can lead to artificial inflation of diversity estimates. Environ Microbiol. 2010, 12: 118-123. 10.1111/j.1462-2920.2009.02051.x.

  21. 21.

    Lazarevic V, Whiteson K, Hernandez D, François P, Schrenzel J: Study of inter- and intra-individual variations in the salivary microbiota. BMC Genomics. 2010, 11: 523-10.1186/1471-2164-11-523.

  22. 22.

    Keijser BJ, Zaura E, Huse SM, van der Vossen JM, Schuren FH, Montijn RC, ten Cate JM, Crielaard W: Pyrosequencing analysis of the oral microflora of healthy adults. J Dent Res. 2008, 87: 1016-1020. 10.1177/154405910808701104.

  23. 23.

    Nasidze I, Li J, Quinque D, Tang K, Stoneking M: Global diversity in the human salivary microbiome. Genome Res. 2009, 19: 636-643. 10.1101/gr.084616.108.

  24. 24.

    Zaura E, Keijser BJ, Huse SM, Crielaard W: Defining the healthy "core microbiome" of oral microbial communities. BMC Microbiol. 2009, 9: 259-10.1186/1471-2180-9-259.

  25. 25.

    Lepp PW, Brinig MM, Ouverney CC, Palm K, Armitage GC, Relman DA: Methanogenic Archaea and human periodontal disease. Proc Natl Acad Sci USA. 2004, 101: 6176-6181. 10.1073/pnas.0308766101.

  26. 26.

    Ghannoum MA, Jurevic RJ, Mukherjee PK, Cui F, Sikaroodi M, Naqvi A, Gillevet PM: Characterization of the oral fungal microbiome (mycobiome) in healthy individuals. PLoS Pathog. 2010, 6: e1000713-10.1371/journal.ppat.1000713.

  27. 27.

    Briese T, Paweska JT, McMullan LK, Hutchison SK, Street C, Palacios G, Khristova ML, Weyer J, Swanepoel R, Egholm M, Nichol ST, Lipkin WI: Genetic detection and characterization of Lujo virus, a new hemorrhagic fever-associated arenavirus from southern Africa. PLoS Pathog. 2009, 5: e1000455-10.1371/journal.ppat.1000455.

  28. 28.

    Palacios G, Druce J, Du L, Tran T, Birch C, Briese T, Conlan S, Quan PL, Hui J, Marshall J, Simons JF, Egholm M, Paddock CD, Shieh WJ, Goldsmith CS, Zaki SR, Catton M, Lipkin WI: A new arenavirus in a cluster of fatal transplant-associated diseases. N Engl J Med. 2008, 358: 991-998. 10.1056/NEJMoa073785.

  29. 29.

    Towner JS, Sealy TK, Khristova ML, Albarino CG, Conlan S, Reeder SA, Quan PL, Lipkin WI, Downing R, Tappero JW, Okware S, Lutwama J, Bakamutumaho B, Kayiwa J, Comer JA, Rollin PE, Ksiazek TG, Nichol ST: Newly discovered ebola virus associated with hemorrhagic fever outbreak in Uganda. PLoS Pathog. 2008, 4: e1000212-10.1371/journal.ppat.1000212.

  30. 30.

    Nakamura S, Maeda N, Miron IM, Yoh M, Izutsu K, Kataoka C, Honda T, Yasunaga T, Nakaya T, Kawai J, Hayashizaki Y, Horii T, Iida T: Metagenomic diagnosis of bacterial infections. Emerg Infect Dis. 2008, 14: 1784-1786. 10.3201/eid1411.080589.

  31. 31.

    Miller CS, Avdiushko SA, Kryscio RJ, Danaher RJ, Jacob RJ: Effect of prophylactic valacyclovir on the presence of human herpesvirus DNA in saliva of healthy individuals after dental treatment. J Clin Microbiol. 2005, 43: 2173-2180. 10.1128/JCM.43.5.2173-2180.2005.

  32. 32.

    Slots J, Slots H: Bacterial and viral pathogens in saliva: disease relationship and infectious risk. Periodontol 2000. 2011, 55: 48-69. 10.1111/j.1600-0757.2010.00361.x.

  33. 33.

    Willner D, Furlan M, Schmieder R, Grasis JA, Pride DT, Relman DA, Angly FE, McDole T, Mariella RP, Rohwer F, Haynes M: Metagenomic detection of phage-encoded platelet-binding factors in the human oral cavity. Proc Natl Acad Sci USA. 2011, 108 (Suppl 1): 4547-4553.

  34. 34.

    Umeda M, Chen C, Bakker I, Contreras A, Morrison JL, Slots J: Risk indicators for harboring periodontal pathogens. J Periodontol. 1998, 69: 1111-1118.

  35. 35.

    Slots J: Human viruses in periodontitis. Periodontol 2000. 2010, 53: 89-110. 10.1111/j.1600-0757.2009.00325.x.

  36. 36.

    Dinsdale EA, Edwards RA, Hall D, Angly F, Breitbart M, Brulc JM, Furlan M, Desnues C, Haynes M, Li L, McDaniel L, Moran MA, Nelson KE, Nilsson C, Olson R, Paul J, Brito BR, Ruan Y, Swan BK, Stevens R, Valentine DL, Thurber RV, Wegley L, White BA, Rohwer F: Functional metagenomic profiling of nine biomes. Nature. 2008, 452: 629-632. 10.1038/nature06810.

  37. 37.

    Gerlach W, Stoye J: Taxonomic classification of metagenomic shotgun sequences with CARMA3. Nucleic Acids Res. 2011, 39: e91-10.1093/nar/gkr225.

  38. 38.

    Meyer F, Paarmann D, D'Souza M, Olson R, Glass EM, Kubal M, Paczian T, Rodriguez A, Stevens R, Wilke A, Wilkening J, Edwards RA: The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics. 2008, 9: 386-10.1186/1471-2105-9-386.

  39. 39.

    Belda-Ferre P, Alcaraz LD, Cabrera-Rubio R, Romero H, Simon-Soro A, Pignatelli M, Mira A: The oral metagenome in health and disease. ISME J. 2012, 6: 46-56. 10.1038/ismej.2011.85.

  40. 40.

    Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, Kulam-Syed-Mohideen AS, McGarrell DM, Marsh T, Garrity GM, Tiedje JM: The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res. 2009, 37: D141-145. 10.1093/nar/gkn879.

Download references

Author information

Correspondence to Vladimir Lazarevic.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

VL, KW, PF and JS contributed to the experimental design and drafting the manuscript. VL carried out laboratory procedures. MØ and LF performed high-throughput sequencing. VL, NG, YG and DH performed the bioinformatics and taxonomic analyses. All authors approved the final manuscript.

Electronic supplementary material

Additional file 1: Relative abundance of taxa inferred from 16S-HTS and 16S-WGS datasets. This Excel file lists taxa and their relative abundance. The RDP Classifier with a ≥ 50% confidence cutoff was used to assign taxonomy. (XLS 166 KB)

Additional file 2: Accuracy of taxonomic assignments for 81-base V1 and 84-base V3 16S rDNA sequences. This graph shows the accuracy of taxonomic assignments for 660 HOMD species from 118 genera as determined using the RDP Classifier with a 50% bootstrap threshold. (PDF 216 KB)

Additional file 3: Changes in taxa proportions inferred from 16S-HTS subsets as a function of PCR cycle number. This is a Word file showing changes (%) in the relative abundance of taxa identified after 25 and 30 cycles of PCR. The values obtained after 20 PCR cycles were used as baseline for comparisons. (A) Taxa identified in all (twelve) 16S-HTS subsets were selected for the analysis. (B) 95%-ID OTUs present in six subsets of the specified V region at a frequency > 0.1% in at least one 20-cycle-subset are presented. Changes in the relative abundance of taxa after specified number of cycles: red, increase; blue, decrease. (DOC 500 KB)

Additional file 4: Relative abundance of taxa in the WGS dataset. This is an Excel file with the taxonomic assignments performed using BLASTN and Bergey's taxonomy accessible via the RDP [40]. Positive BLASTN hits values (wordsize of 16, ≥ 94% identity, ≥ 90 overlap and e-value ≤ 10-30) were normalized so that they sum to 100%. In cases where two or more top BLASTN hits were identical, the lowest common ancestor was calculated. (XLS 46 KB)

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Authors’ original file for figure 2

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and Permissions

About this article

Keywords

  • Taxonomic Assignment
  • Whole Genome Shotgun
  • Whole Genome Shotgun Sequencing
  • Ribosomal Database Project Classifier
  • Salivary Microbiota