COVID-19, which has emerged as a global pandemic, is caused by the newly
identified SARS-CoV-2. The current situation calls for an understanding of the molecular
basis of the evolution of this emerging pathogen. In this context, we conducted a
comparative codon-based characterization of the viruses within the species Severe
acute respiratory syndrome-related coronavirus (SARSr-CoV). We performed
phylogenetic analysis and codon-based characterization, employing selection
pressure and Shannon entropy analyses, on the S2 subunit gene sequences of SARS-CoV,
Bat-SL-CoV and SARS-CoV-2. Further, the pattern of N-linked/O-linked
glycosylation was analyzed within the SARS-CoV species. The phylogenetic analysis
and pairwise distance calculations showed high similarity between the S2 subunit of
SARS-CoV-2 and that of Bat-SL-CoVs. Our findings revealed a low mean dN/dS value,
suggesting purifying selection, but certain codon positions were found to be under
positive selection. The entropy analysis identified 71 codon positions with high
entropy scores. Three codon positions (160, 244 and 562) were both positively
selected and of high entropy, suggesting that they are more prone to mutation.
Further, the analysis revealed a conserved pattern of N-linked glycosylation, though
discrepancies were found in the O-linked glycosylation pattern. Our findings may
help predict signature sequences based on a codon-based model of
molecular evolution. Further, this approach may provide information on the
evolutionary dynamics of this pathogen, facilitating much-desired control strategies
against COVID-19.
Keywords: Bat-SL-CoV, Coronavirus, N-linked and O-linked glycosylation,
SARS-CoV, SARS-CoV-2, Selection pressure analysis, Shannon entropy
analysis.
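
The Shannon entropy analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes that entropy is computed per codon column of an in-frame nucleotide alignment, with a base-2 logarithm. The abstract does not state the tool, the logarithm base, or whether codons or amino acids were scored, so those choices, the function name `codon_entropy`, the toy sequences and the cutoff of 1.0 are all hypothetical.

```python
import math
from collections import Counter


def codon_entropy(aligned_seqs):
    """Shannon entropy (bits) of each codon column in an in-frame alignment.

    Assumption: sequences are already aligned, gap-free, and of equal
    length that is a multiple of three. This is an illustration, not the
    procedure used in the study.
    """
    length = len(aligned_seqs[0])
    if length % 3 != 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must share a length divisible by 3")

    entropies = []
    for start in range(0, length, 3):
        # Collect the codon found at this position in every sequence.
        column = [seq[start:start + 3] for seq in aligned_seqs]
        total = len(column)
        counts = Counter(column)
        # H = sum over observed codons of -p * log2(p)
        h = sum(-(n / total) * math.log2(n / total) for n in counts.values())
        entropies.append(h)
    return entropies


# Hypothetical toy alignment (three codons, four sequences); not real S2 data.
toy_alignment = ["ATGGCTAAA", "ATGGCAAAA", "ATGGCTAAG", "ATGGCGAAA"]
scores = codon_entropy(toy_alignment)

# Hypothetical cutoff for "high entropy"; the abstract does not give one.
THRESHOLD = 1.0
high_entropy_positions = [i + 1 for i, h in enumerate(scores) if h > THRESHOLD]
```

A fully conserved codon column scores 0, and the score rises as more distinct codons appear at a position. Positions above the chosen cutoff could then be cross-checked against sites reported as positively selected.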