A Practical Comparison of Short- and Long-Read Metabarcoding Sequencing: Challenges and Solutions for Plastid Read Removal and Microbial Community Exploration of Seaweed Samples.

Journal: Molecular Ecology Resources
Published:
Abstract

Short-read metabarcoding analysis is the gold standard for accessing partial 16S and ITS genes with high read quality. With the advent of long-read sequencing, the amplification of full-length target genes is possible, but with low read accuracy. Moreover, 16S rRNA gene amplification in seaweed results in a large proportion of plastid reads, which are directly or indirectly derived from cyanobacteria. Primers designed not to amplify plastid sequences are available for short-read sequencing, while Oxford Nanopore Technologies (ONT) offers adaptive sampling, a unique way to remove reads in real time. In this study, we compare three options to address the issue of plastid reads: deleting plastid reads with adaptive sampling, using optimised primers with Illumina MiSeq technology, and sequencing large numbers of reads with Illumina NovaSeq technology and universal primers. We show that adaptive sampling using the default settings of the MinKNOW software was ineffective for plastid depletion. NovaSeq sequencing with universal primers stood out for its deep coverage, low error rate, and ability to include both eukaryotes and bacteria in the same sequencing run, but it had limitations regarding the identification of fungi. ONT sequencing helped us explore the fungal diversity and allowed for the retrieval of taxonomic information for genera poorly represented in the sequence databases. We also demonstrated with a mock community that the SAMBA workflow provided more accurate taxonomic assignment at the bacterial genus level than the IDTAXA and KRAKEN2 pipelines, but many false positives were generated at the species level.

Authors
Coralie Rousseau, Nicolas Henry, Sylvie Rousvoal, Gwenn Tanguy, Erwan Legeay, Catherine Leblanc, Simon Dittami