• Choice of assembly software has a critical impact on virome characterisation

      Sutton, Thomas D S; Clooney, Adam G; Ryan, Feargal J; Ross, R Paul; Hill, Colin. Funders: Science Foundation Ireland; European Regional Development Fund; Janssen Biotech, Inc. Grants: SFI/12/RC/2273; SFI/14/SP APC/B3032 (BioMed Central, 2019-01-28)
      Background: The viral component of microbial communities plays a vital role in driving bacterial diversity, facilitating nutrient turnover and shaping community composition. Despite their importance, the vast majority of viral sequences are poorly annotated and share little or no homology to reference databases. As a result, investigation of the viral metagenome (virome) relies heavily on de novo assembly of short sequencing reads to recover compositional and functional information. Metagenomic assembly is particularly challenging for virome data, often resulting in fragmented assemblies and poor recovery of viral community members. Despite the essential role of assembly in virome analysis and difficulties posed by these data, current assembly comparisons have been limited to subsections of virome studies or bacterial datasets.
      Design: This study presents the most comprehensive virome assembly comparison to date, featuring 16 metagenomic assembly approaches which have featured in human virome studies. Assemblers were assessed using four independent virome datasets, namely, simulated reads, two mock communities, viromes spiked with a known phage and human gut viromes.
      Results: Assembly performance varied significantly across all test datasets, with SPAdes (meta) performing consistently well. Performance of MIRA and VICUNA varied, highlighting the importance of using a range of datasets when comparing assembly programs. It was also found that while some assemblers addressed the challenges of virome data better than others, all assemblers had limitations. Low read coverage and genomic repeats resulted in assemblies with poor genome recovery, high degrees of fragmentation and low-accuracy contigs across all assemblers. These limitations must be considered when setting thresholds for downstream analysis and when drawing conclusions from virome data.
    • Reproducible protocols for metagenomic analysis of human faecal phageomes

      Shkoporov, Andrey N; Ryan, Feargal J; Draper, Lorraine A.; Forde, Amanda; Stockdale, Stephen R.; Daly, Karen M.; McDonnell, Siobhan A.; Nolan, James A.; Sutton, Thomas D S; Dalmasso, Marion; et al. (BioMed Central, 2018-04-10)
      Background: Recent studies have demonstrated that the human gut is populated by complex, highly individual and stable communities of viruses, the majority of which are bacteriophages. While disease-specific alterations in the gut phageome have been observed in IBD, AIDS and acute malnutrition, the human gut phageome remains poorly characterised. One important obstacle in metagenomic studies of the human gut phageome is a high level of discrepancy between results obtained by different research groups. This is often due to the use of different protocols for enriching virus-like particles, nucleic acid purification and sequencing. The goal of the present study is to develop a relatively simple, reproducible and cost-efficient protocol for the extraction of viral nucleic acids from human faecal samples, suitable for high-throughput studies. We also analyse the effect of certain potential confounding factors, such as storage conditions, repeated freeze-thaw cycles, and operator bias on the resultant phageome profile. Additionally, spiking of faecal samples with an exogenous phage standard was employed to quantitatively analyse phageomes following metagenomic sequencing. Comparative analysis of phageome profiles to bacteriome profiles was also performed following 16S rRNA amplicon sequencing.
      Results: Faecal phageome profiles exhibit an overall greater individual specificity when compared to bacteriome profiles. The phageome and bacteriome both exhibited moderate change when stored at +4 °C or room temperature. Phageome profiles were less impacted by multiple freeze-thaw cycles than bacteriome profiles, but there was a greater chance for operator effect in phageome processing. The successful spiking of faecal samples with exogenous bacteriophage demonstrated large variations in the total viral load between individual samples.
      Conclusions: The faecal phageome sequencing protocol developed in this study provides a valuable additional view of the human gut microbiota that is complementary to 16S amplicon sequencing and/or metagenomic sequencing of total faecal DNA. The protocol was optimised for several confounding factors that are encountered while processing faecal samples, to reduce discrepancies observed within and between research groups studying the human gut phageome. Rapid storage, limited freeze-thaw cycling and spiking of faecal samples with an exogenous phage standard are recommended for optimum results.