SGV-caller: SARS-CoV-2 genome variation caller.

Journal: Heliyon
Published:
Abstract

Given the pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), continuous analysis of its genomic variations at the nucleotide level is imperative to monitor the emergence of novel variants of concern. The Global Initiative on Sharing All Influenza Data (GISAID) serves as the de facto standard database for the genomic information of SARS-CoV-2. However, limitations of its data-sharing policy hinder the comprehensive analysis of genomic variations. To address this problem, we developed SGV-caller, a bioinformatics pipeline for analyzing the frequently updated GISAID database. SGV-caller compares input datasets with pre-existing databases and generates local databases encompassing nucleotide, amino acid, and codon-level genomic variations for each SARS-CoV-2 genome. Furthermore, SGV-caller accommodates SARS-CoV-2 genomes from non-GISAID sources as well as other viral genomes. SGV-caller source code and test data are available at https://github.com/wujiaqi06/SGV-caller.
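
As a rough illustration of what calling variations at the nucleotide, codon, and amino acid levels involves, the following Python sketch compares one genome against a reference over a single coding region. It is not taken from the SGV-caller source: the function name `call_variations`, the `Variation` record, the label formats, and the toy sequences are all assumptions made for illustration. It also assumes the two sequences are already aligned, equal in length, and free of gaps.

```python
from typing import List, NamedTuple

# Standard genetic code (NCBI translation table 1), bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


class Variation(NamedTuple):
    """Hypothetical record holding the three levels of variation."""
    nucleotide: List[str]
    codon: List[str]
    amino_acid: List[str]


def call_variations(ref, query, cds_start=1, cds_end=None):
    """Compare two pre-aligned, gap-free sequences of equal length.

    cds_start and cds_end are 1-based, inclusive coordinates of one coding
    region; they default to the whole sequence (an assumption for this toy).
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    ref, query = ref.upper(), query.upper()
    cds_end = len(ref) if cds_end is None else cds_end
    if (cds_end - cds_start + 1) % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")

    # Nucleotide level: every differing position, 1-based, e.g. "A5G".
    nucleotide = [
        f"{r}{pos}{q}"
        for pos, (r, q) in enumerate(zip(ref, query), start=1)
        if r != q
    ]

    # Codon and amino acid levels: walk the coding region codon by codon.
    codon, amino_acid = [], []
    for number, start in enumerate(range(cds_start - 1, cds_end, 3), start=1):
        ref_codon = ref[start:start + 3]
        query_codon = query[start:start + 3]
        if ref_codon == query_codon:
            continue
        codon.append(f"{ref_codon}{number}{query_codon}")
        # Codons containing ambiguous bases translate to "X".
        ref_aa = CODON_TABLE.get(ref_codon, "X")
        query_aa = CODON_TABLE.get(query_codon, "X")
        if ref_aa != query_aa:
            amino_acid.append(f"{ref_aa}{number}{query_aa}")

    return Variation(nucleotide, codon, amino_acid)


# Toy example: one substitution that changes aspartate (D) to glycine (G).
result = call_variations("ATGGATTTG", "ATGGGTTTG")
print(result.nucleotide)   # ['A5G']
print(result.codon)        # ['GAT2GGT']
print(result.amino_acid)   # ['D2G']
```

Keeping the three levels separate lets a synonymous substitution appear in the nucleotide and codon lists while leaving the amino acid list empty. A real pipeline would additionally have to handle insertions, deletions, multiple genes, and ambiguous bases.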

Authors