COVID-19 Related Research by Data Mining in Single Cell Transcriptome Profiles

2021-03-27 22:33:04

Abstract—The outbreak of coronavirus disease 2019 (COVID-19) has drawn public attention all over the world. As a newly emerging technique, single cell sequencing also exerts its power in the battle against the epidemic. In this review, the up-to-date knowledge of COVID-19 and its receptor is summarized, followed by a survey of studies that mine single cell transcriptome profiling data for information on the vulnerable cell types in humans and the potential mechanisms of the disease.

1.Introduction

The pneumonia caused by the novel coronavirus struck Wuhan, China at the end of 2019 and has been declared a Public Health Emergency of International Concern (PHEIC) by the World Health Organization (WHO). The pathogen causing the infection has been identified as a novel coronavirus, later named the coronavirus disease 2019 (COVID-19) virus, which can spread through contact with patients, contaminated surfaces, or respiratory droplets in the air. The symptoms of this infection resemble those of severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS), including fever, cough, diarrhea, nausea, and even multi-organ failure.

In the battle between humans and infectious epidemics, the newly emerged technique of single cell transcriptome profiling also has its place. Single cell ribonucleic acid sequencing (scRNA-seq) has developed at an exciting speed and been applied on an increasing scale, offering an unprecedentedly powerful tool to clarify the mechanisms of biological function and disease at cellular resolution. No scRNA-seq data from COVID-19 patients are available to date. However, as the receptor of COVID-19 has already been identified, the existing single cell transcription profiles in databases can serve as samples of the susceptible population for data mining to unveil the vulnerability of different parts of the human body, the route of propagation, and the possible mechanism of this newly emerged disease.

2.Prerequisite of COVID-19 and Its Receptor in Humans

Knowledge about the COVID-19 virus and its route into the human body has been deciphered rapidly, and it is on this basis that the mining of scRNA-seq data has found its way into COVID-19 research. The virus is a self-replicating single-stranded RNA virus, and its RNA was sequenced right after the outbreak of the epidemic, with the result showing that the COVID-19 RNA shares 76.47% sequence identity with the SARS coronavirus (SARS-CoV) [1]. COVID-19 has been found to invade the human body through the association of its spike glycoprotein with angiotensin I converting enzyme 2 (ACE2) [2], indicating that ACE2 is the putative receptor of COVID-19. ACE2 is a member of the angiotensin-converting enzyme family of dipeptidyl carboxydipeptidases, and it plays a role in the regulation of cardiovascular and renal function, as well as fertility [3].

ACE2 is also the receptor of SARS-CoV. According to the structural analysis of the COVID-19 spike glycoprotein by cryo-electron microscopy, the dissociation constant KD of COVID-19 for its receptor ACE2 is 15 nM [4], indicating an affinity around 10 to 20 fold higher than that of SARS-CoV, whose KD is 325.8 nM. Therefore, studies using single cell profiles based on ACE2 might catch some hints or even reliable clues for further medical research such as drug or vaccine development. Based on this knowledge, single cell bioinformatics studies of COVID-19 have already generated several meaningful discoveries.
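The fold difference in affinity follows directly from the two dissociation constants, since a smaller KD means tighter binding. The short sketch below only redoes that arithmetic with the two values quoted above; the function and constant names are illustrative and come from no cited work.

```python
def fold_affinity(kd_reference_nm: float, kd_query_nm: float) -> float:
    """Return how many times tighter the query binds than the reference.

    A smaller dissociation constant (KD) means tighter binding, so the
    fold difference in affinity is the ratio reference KD / query KD.
    """
    if kd_reference_nm <= 0 or kd_query_nm <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_reference_nm / kd_query_nm


# KD values exactly as quoted in the text, both in nM.
KD_COVID19_NM = 15.0
KD_SARS_COV_NM = 325.8

fold = fold_affinity(KD_SARS_COV_NM, KD_COVID19_NM)
print(f"Affinity is about {fold:.1f} fold higher")
```

With these two numbers the ratio is about 21.7, at the upper edge of the quoted range of 10 to 20 fold.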

3.COVID-19 Research Using Single Cell Transcriptome Profiles

3.1.Overview of ACE2 Expression throughout Human Body

Although a previous study has detected the expression of ACE2 in different human tissues by real-time polymerase chain reaction (PCR) [5], the exact cell types that express ACE2, and thus might be more vulnerable to COVID-19 infection, could not be revealed by such techniques. To fill this gap, Zou et al. [6] analyzed published scRNA-seq datasets for the expression of ACE2 and generated a whole-body map of the COVID-19 infection risk in different human cell types. According to their results, the respiratory tract, heart, esophagus, ileum, kidney, and bladder are vulnerable to COVID-19 due to relatively high expression of ACE2 in respiratory epithelial cells, myocardial cells, esophagus epithelial cells, ileal epithelial cells, kidney proximal tubule cells, and urothelial cells, respectively.
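The core of this kind of data mining can be pictured as counting, within each annotated cell type, the share of cells in which ACE2 is detected. The following is a minimal sketch under stated assumptions, not the pipeline of Zou et al.: the cells, counts, helper name, and cutoff are all hypothetical and exist only to show the calculation.

```python
from collections import defaultdict


def ace2_positive_fraction(cells, gene="ACE2"):
    """Return {cell_type: fraction of cells with at least one count of gene}.

    `cells` is an iterable of (cell_type, counts) pairs, where `counts`
    maps gene symbols to raw read/UMI counts for one cell.
    """
    total = defaultdict(int)
    positive = defaultdict(int)
    for cell_type, counts in cells:
        total[cell_type] += 1
        if counts.get(gene, 0) > 0:
            positive[cell_type] += 1
    return {ct: positive[ct] / total[ct] for ct in total}


# Toy data invented for illustration only; not taken from any dataset.
TOY_CELLS = [
    ("AT2", {"ACE2": 2, "SFTPC": 40}),
    ("AT2", {"SFTPC": 35}),
    ("AT2", {"SFTPC": 50}),
    ("AT2", {"ACE2": 1, "SFTPC": 28}),
    ("T cell", {"CD3E": 6}),
    ("T cell", {"CD3E": 9}),
]

fractions = ace2_positive_fraction(TOY_CELLS)

# Hypothetical cutoff: flag a cell type when more than 1% of its cells
# express ACE2. The threshold is an assumption made for this sketch.
HYPOTHETICAL_CUTOFF = 0.01
flagged = sorted(ct for ct, f in fractions.items() if f > HYPOTHETICAL_CUTOFF)
print(fractions, flagged)
```

A real analysis would start from a normalized expression matrix with curated cell type annotations and would have to account for dropout, which makes low-abundance transcripts such as ACE2 easy to miss in individual cells.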

This study covers most of the systems and major organs of the human body and gives a rough overview of the vulnerability to COVID-19 infection. However, it merely describes the vulnerability itself, with little of the mechanism behind it. In the following text, studies of COVID-19 in different organs and tissues are reviewed, with more detailed findings on the injury caused to patients' health, the possible route of propagation, and the putative mechanisms.

3.2.Further Exploration of ACE2 Expression in Different Organs or Tissues

A.Lung

Zhao et al. [7] analyzed the single cell transcription profiles of lungs from eight normal adult human donors, finding that only a small proportion of type II alveolar (AT2) cells express ACE2, which is in concordance with the whole-body vulnerability study mentioned above [6]. To explore the mechanism of infection in these cells, Gene Ontology (GO) enrichment analysis was performed, showing that GO terms related to the viral process, including the general positive regulation of the viral process as well as viral replication and assembly, are highlighted in these ACE2-positive AT2 cells. This might help COVID-19 exploit the functional machinery of these AT2 cells for viral proliferation, further infection, and spread into other parts of the human body. Some severe cases could even induce sepsis and acute respiratory syndrome, causing a fatality rate of approximately 15%.
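One common way to decide whether a GO term is "highlighted" in a gene list is an over-representation test based on the hypergeometric distribution. The sketch below illustrates that general idea only. Whether Zhao et al. used this exact test is an assumption, and the gene counts in the example are made up.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_selected, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    n_background: genes in the background set
    n_term:       background genes annotated with the GO term
    n_selected:   genes in the list of interest (e.g. up in ACE2+ cells)
    n_overlap:    selected genes that carry the GO term
    """
    upper = min(n_term, n_selected)
    denominator = comb(n_background, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denominator


# Hypothetical numbers: 20 background genes, 5 annotated with the term,
# 4 genes selected, 3 of which carry the annotation.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(f"p = {p_value:.4f}")
```

In practice the test is repeated for every GO term, so the resulting p-values need a multiple-testing correction before any term is reported as enriched.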

The strengths of single cell analysis in resolution and specificity are prominent in these results. Interestingly, another conclusion of this study seems to attract even more public attention: males might be more vulnerable than females due to a higher ratio of ACE2-expressing cells. Considering the small number of individuals (two men and six women) used in the study, more evidence is needed before a solid conclusion can be drawn.

B.Respiratory Tract

As the gateway of the respiratory system, the upper respiratory tract is the first to bear the brunt as the virus makes its way into the lungs. Wu et al. [8] investigated ACE2 expression in different parts of the respiratory tract, including nasal, bronchial, turbinate, and lung tissue. The scRNA-seq data analysis indicates that nasal epithelial cells, in which the ACE2-expressing cell population is comparable to that of AT2 cells, could be a putative host of COVID-19. This is consistent with the PCR result that the concentration of the virus is higher in nasal swabs than in throat swabs.

C.Digestive Tract

As COVID-19 patients exhibit diarrhea and nausea along with the typical symptoms of pneumonia, Zhang et al. [9] analyzed the expression of ACE2 in scRNA-seq profiles of the human digestive system. They found that ACE2 is expressed at a relatively high level in the absorptive enterocytes of the ileum, the upper and stratified epithelial cells of the esophagus, and the enterocytes of the colon, indicating that the COVID-19 virus could also invade the human body through the digestive tract.

D.Testis and Kidney

Since COVID-19 patients exhibit kidney damage after infection, the expression of ACE2 in the kidney has been investigated, showing that renal tubular cells express ACE2 at a relatively high level [10]. The authors of the same study also examined the expression of ACE2 in the testis, finding a high level of expression in Leydig cells and seminiferous duct cells, which indicates potential damage to fertility in patients [10]. These results imply that special attention should be paid to the risk of testicular lesions in patients during clinical work, and that some follow-up might be needed.

4.Discussion

Among the many studies of COVID-19, scRNA-seq related studies are highly conspicuous because their cellular resolution sheds light on the possible mechanism of the disease and on potential targets for the development of medical treatment. The results of these studies also inform not only medical staff but also the public of the possible routes of infection and propagation, the injury that the infection could do to the human body, and the potential sequelae.

One of the major drawbacks of these studies is that the sample sizes are too small. There are still no scRNA-seq data from COVID-19 patients, let alone samples collected and sequenced from patients at different stages of the infection during this hectic time of fighting the pestilence. Nevertheless, numerous publicly available datasets in databases could feed the data analysis workflows of the studies mentioned above and support more solid conclusions. Take the research using the lung data [7] for instance: a sample of eight individuals is far too small to support a statistically significant conclusion, so the analysis results have been overinterpreted to some extent.
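The limitation of eight donors can be made concrete with a simple counting argument. If donors are treated as the unit of replication (an assumption of this sketch, not a statement about how the cited study was analyzed), an exact permutation test comparing two men with six women has only a handful of possible group labelings, and its p-value cannot fall below one over that number. The function name is illustrative.

```python
from math import comb


def min_permutation_p(n_group_a: int, n_group_b: int) -> float:
    """Smallest p-value an exact two-group permutation test can return.

    The p-value is the share of group labelings at least as extreme as
    the observed one, so it is never smaller than 1 / (number of
    distinct ways to split the donors into the two groups).
    """
    n_labelings = comb(n_group_a + n_group_b, n_group_a)
    return 1 / n_labelings


# Two men and six women, as in the lung study discussed in the text.
floor_p = min_permutation_p(2, 6)
print(f"{comb(8, 2)} labelings, minimum p = {floor_p:.4f}")
```

With 28 possible labelings the floor is about 0.036, reached only when the two men are the two most extreme donors, so even the best case sits barely under the conventional 0.05 threshold and leaves no room for multiple-testing correction.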

Furthermore, the data analysis should not be restricted to the commonly used single cell analysis workflow adopted in the papers mentioned above. At the moment, the mining of public single cell transcriptome profiling data is clearly informative. However, most studies stop at preliminary results, and their interpretations are restricted to the canonical paradigm, which is mainly the identification of vulnerable cell types. In fact, far more data analysis tools dedicated to single cell RNA profiling exist than have been adopted in existing research. Therefore, besides cell type identification, functional representation and interpretation are also feasible [11], [12]. Further studies that incorporate more functional analysis for a thorough understanding of the mechanism of the disease would undoubtedly be more helpful for humans to win the battle against COVID-19 as well as other rampant infectious epidemics.

Disclosures

The authors declare no conflicts of interest.