- Authors: Abecasis A, Cuypers L, Libin P, Theys K, Vandamme AM
- Publication Year: 2017
- Journal: Tropical Medicine & International Health
In recent years, an increasing number of Zika virus (ZIKV) infections has been reported in Asia and America. Due to socio-environmental factors that remain to be established, this arbovirus has spread explosively to more than 60 countries worldwide. ZIKV infections can result in severe clinical manifestations, yet little is known to date about how viral genetic variability affects disease outcome.
A detailed analysis of genome-wide diversity and selective pressure will provide new insights for identifying specific virus variants that threaten public health, which is essential for the design and further development of diagnostics, therapeutics and vaccines.
A dataset of 153 full-length genome sequences was gathered from GenBank. A pairwise alignment tool that automatically annotates the codon-correct aligned sequence dataset was developed. Genotypes were assessed using the Arbovirus-Genotyping Tool (http://www.bioafrica.net/software.php) and manual phylogenetic analysis using a maximum-likelihood approach (GTR+Γ). Nucleotide and amino acid diversity across the genome were mapped and quantified, and selective pressure was identified using the FUBAR method in HyPhy.
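As an illustration of how per-position amino acid diversity could be quantified from an aligned protein dataset, the following minimal Python sketch may help. It is not the authors' tool: the diversity definition (fraction of non-gap residues differing from the most common residue at a position), the function names, the gap characters and the toy sequences are all assumptions made for this example, and the 5% cut-off is only a default parameter.

```python
from collections import Counter

# Characters treated as missing data (an assumption for this sketch).
GAP_CHARS = {"-", "X", "?"}


def site_diversity(column):
    """Fraction of non-gap residues that differ from the most common residue."""
    residues = [r for r in column if r not in GAP_CHARS]
    if not residues:
        return 0.0
    most_common_count = Counter(residues).most_common(1)[0][1]
    return 1.0 - most_common_count / len(residues)


def diversity_profile(aligned_proteins):
    """Per-position diversity for a list of equal-length aligned sequences."""
    lengths = {len(seq) for seq in aligned_proteins}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*...) iterates over alignment columns.
    return [site_diversity(col) for col in zip(*aligned_proteins)]


def fraction_variable(profile, threshold=0.05):
    """Share of positions whose diversity exceeds the threshold."""
    if not profile:
        return 0.0
    return sum(d > threshold for d in profile) / len(profile)


# Hypothetical toy alignment, not real ZIKV data.
toy_alignment = [
    "MKNPK",
    "MKNPK",
    "MRNPK",
    "MKN-K",
]

profile = diversity_profile(toy_alignment)
variable_share = fraction_variable(profile)
```

In this toy case only the second position varies (one of four residues differs), so one of five positions exceeds the threshold. Applied to a real codon-correct alignment, the same profile could be computed per protein to compare regions of the genome.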
The pairwise codon-aware alignment tool served as a pre-processing tool to study genetic diversity in ZIKV full-genome and individual protein data. Phylogenetic reconstruction showed the evolution of ZIKV into two divergent genotypes, with all but two strains assigned to the Asian genotype, consistent with the assignments of the Arbovirus-Genotyping Tool. Overall, 0.7% of all genome positions were characterized by amino acid diversity higher than 5%, with the highest nucleotide variability observed for proteins C, M, peptide 2K and NS5. Selective pressure analysis detected no site in the full genome under positive selective pressure, while 83.9% of all amino acid positions experienced negative selective pressure (posterior probability >0.90). Proteins C and NS5 had the lowest proportion of positions under negative selective pressure (<75%).
While ZIKV has a high nucleotide variability, a high proportion of amino acid sites is conserved, with no signal for positive selective pressure. This diversity map and selective pressure analysis show that protein NS5 is characterized by high nucleotide and amino acid diversity, making it a less suitable candidate as a diagnostic, vaccine or antiviral target.