Acta Mathematica Scientia, Series B, 2021, Vol. 41, Issue (3): 1017-1022. doi: 10.1007/s10473-021-0323-x

ANALYSIS OF THE GENOMIC DISTANCE BETWEEN BAT CORONAVIRUS RATG13 AND SARS-COV-2 REVEALS MULTIPLE ORIGINS OF COVID-19

Shaojun PEI¹, Stephen S.-T. YAU¹,²

  1. Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China
  2. Yanqi Lake Beijing Institute of Mathematical Sciences and Applications, Beijing 101408, China
  • Received: 2021-02-25  Revised: 2021-03-10  Online: 2021-06-25  Published: 2021-06-07
  • Contact: Stephen S.-T. YAU, E-mail: yau@uic.edu
  • Supported by: This work was supported by the Tsinghua University Spring Breeze Fund (2020Z99CFY044), the Tsinghua University start-up fund, and the Tsinghua University Education Foundation fund (042202008).

Abstract: The severe acute respiratory syndrome COVID-19 was discovered on December 31, 2019 in China. Subsequently, cases were reported in many other countries. However, in other countries, such as France and Italy, some positive COVID-19 samples have been reported that predate the cases officially accepted by health authorities. Thus, it is of great importance to determine where SARS-CoV-2 was first transmitted to humans. To this end, we analyze SARS-CoV-2 genomes using the k-mer natural vector method and compare the similarities of global SARS-CoV-2 genomes under a new natural metric. Because it is commonly accepted that SARS-CoV-2 originated from bat coronavirus RaTG13, we only need to determine which SARS-CoV-2 genome sequence is closest to bat coronavirus RaTG13 under our natural metric. Our analysis indicates that SARS-CoV-2 most likely already existed in other countries, such as France, India, the Netherlands, England and the United States, before the outbreak in Wuhan, China.
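
For readers unfamiliar with the approach, the following is a minimal illustrative sketch of a k-mer natural vector and a nearest-genome search. It is not the authors' implementation. The function names (`kmer_natural_vector`, `euclidean`, `closest_to_reference`), the normalization of the second moment, and the use of plain Euclidean distance are all assumptions made for illustration; the abstract does not define the paper's "new natural metric", so Euclidean distance only stands in for it here.

```python
from collections import defaultdict
from itertools import product
import math


def kmer_natural_vector(seq, k):
    """Return counts, mean positions and normalized second central moments
    for all 4**k k-mers over ACGT, concatenated into one list.

    Assumption: the second moment is divided by (count * number of k-mer
    windows); the paper may use a different normalization.
    """
    seq = seq.upper()
    total = len(seq) - k + 1  # number of k-mer windows in the sequence
    positions = defaultdict(list)
    for i in range(total):
        positions[seq[i:i + k]].append(i + 1)  # 1-based start position

    counts, means, moments = [], [], []
    # Only k-mers over ACGT are enumerated, so windows containing other
    # symbols (for example N) are ignored.
    for kmer in map("".join, product("ACGT", repeat=k)):
        pos = positions.get(kmer, [])
        n = len(pos)
        if n == 0:
            counts.append(0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu = sum(pos) / n
        d2 = sum((p - mu) ** 2 for p in pos) / (n * total)
        counts.append(n)
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def euclidean(u, v):
    """Plain Euclidean distance; a stand-in for the paper's natural metric."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_to_reference(reference, candidates, k=2):
    """Return the name of the candidate sequence nearest to the reference.

    `candidates` maps a hypothetical label (e.g. a sample ID) to a sequence.
    """
    ref_vec = kmer_natural_vector(reference, k)
    return min(
        candidates,
        key=lambda name: euclidean(ref_vec, kmer_natural_vector(candidates[name], k)),
    )
```

In this sketch, `reference` would play the role of the RaTG13 genome and `candidates` the collection of SARS-CoV-2 genomes; the choice of `k` is left to the caller.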

Key words: SARS-CoV-2, multiple origins of COVID-19, mathematical genomic distance, k-mer natural vector

CLC Number: 92-08