Genome Med. 2020 Jul 28;12(1):68.
doi: 10.1186/s13073-020-00763-0.

Characterisation of the transcriptome and proteome of SARS-CoV-2 reveals a cell passage induced in-frame deletion of the furin-like cleavage site from the spike glycoprotein

Andrew D Davidson et al. Genome Med.

Abstract

Background: SARS-CoV-2 is a recently emerged respiratory pathogen that has significantly impacted global human health. We wanted to rapidly characterise the transcriptomic, proteomic and phosphoproteomic landscape of this novel coronavirus to provide a fundamental description of the virus's genomic and proteomic potential.

Methods: We used direct RNA sequencing to determine the transcriptome of SARS-CoV-2 grown in Vero E6 cells, which are widely used to propagate the novel coronavirus. The viral transcriptome was analysed using a recently developed ORF-centric pipeline. Allied to this, we used tandem mass spectrometry to investigate the proteome and phosphoproteome of the same virally infected cells.

Results: Our integrated analysis revealed that the viral transcripts (i.e. subgenomic mRNAs) generally fitted the expected transcription model for coronaviruses. Importantly, a 24 nt in-frame deletion was detected in over half of the subgenomic mRNAs encoding the spike (S) glycoprotein and was predicted to remove a proposed furin cleavage site from the S glycoprotein. Tandem mass spectrometry identified over 500 viral peptides and 44 phosphopeptides in virus-infected cells, covering almost all proteins predicted to be encoded by the SARS-CoV-2 genome, including peptides unique to the deleted variant of the S glycoprotein.

Conclusions: Detection of an apparently viable deletion in the furin cleavage site of the S glycoprotein, a leading vaccine target, shows that this and other regions of SARS-CoV-2 proteins may readily mutate. The furin site directs cleavage of the S glycoprotein into functional subunits during virus entry or exit and likely contributes strongly to the pathogenesis and zoonosis of this virus. Our data emphasise that the viral genome sequence should be carefully monitored during the growth of viral stocks for research, animal challenge models and, potentially, in clinical samples. Such variations may result in different levels of virulence, morbidity and mortality.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Overview of nanopore inferred transcriptome. (a) The classical transcription map of coronaviruses adapted for SARS-CoV-2. The genome is itself an mRNA which, when translated, gives rise to polyproteins pp1a and, upon a ribosomal frameshift, pp1ab. These polyproteins are proteolytically processed down to a range of non-structural proteins termed nsp1–16, some of which will form the viral replication-transcription complex (RTC). The RTC then generates subgenomic mRNA which canonically contains a sequence present at the 5′ end of the viral genome known as the leader sequence. The 3′ end of the leader sequence has a motif, the transcription regulatory sequence (TRS), and there are similar sequences which precede each of the functional ORFs downstream of the replicase gene (pp1ab). The TRS in the leader associates with one of the TRS regions present adjacent to each of the other functional ORFs and this mediates discontinuous transcription between the two during minus-strand RNA synthesis. These minus-strand RNA molecules are used as templates to generate positive sense mRNA, and in this manner, the remaining ORFs on the viral genome are placed 5′ most on the resulting subgenomic mRNAs and are subsequently translated. Orange boxes represent structural proteins and yellow boxes represent accessory proteins. (b) The total read depth across the viral genome for all reads; the maximum read depth was 511,129. (c) The structure of only the dominant transcript that codes for each of the identified ORFs. Only transcripts that start inside the leader TRS sequence are considered here. The rectangles represent mapped nucleotides and the arrowed lines represent regions of the genome that are not transcribed during the generation of mRNAs. To the right is noted the 5′ most ORF encoded in the transcript; in parentheses we note how many individual transcripts were observed. Transcripts coding for proteins we subsequently detected by MS/MS are coloured in green.
Fig. 2
Deletions within the viral mRNAs encoding the S glycoprotein and N protein. (a) The read depth over the region deleted in the S glycoprotein together with information on the sequence in the region and the translation in all three frames. (b) A Clustal alignment of four proteins over this region: wild type SARS-CoV, wild type SARS-CoV-2, the artificially deleted version of the wild type SARS-CoV-2 S glycoprotein as reported in Walls et al. [38] and finally the predicted sequence of the deleted protein described here. Highlighted in yellow is the sequence of the unique peptide generated by chymotrypsin digest of the protein which was identified by tandem mass spectrometry. The positions of predicted protease cleavage sites [23] at the S1/S2 boundary are shown. (c) A proposed deletion in the N protein predicted by multiple aligned transcripts and subsequently identified in trypsin digested protein samples as indicated by the unique peptide highlighted in yellow.
Fig. 3
A space-filling model of the wild type SARS-CoV-2 S glycoprotein in trimeric form, built using the sequence of (a) the native virus or (b) the spike deletant virus, in which amino acids 679NSPRRARSV687 have been replaced with isoleucine. The model was built using a cryo-EM structure (6VSB.pdb) of the S glycoprotein in the prefusion form [25]. Each of the monomers is coloured differently. The loop containing the furin cleavage site (or the shortened loop in the deleted version in b) is indicated in red. The positions of surface-located phosphorylation sites identified by mass spectrometry were mapped on the native structure and are shown in yellow in (a).
Fig. 4
Schematic of the location of phosphorylation sites. Proteins M, N, NSP3, NSP9 and S glycoprotein are shown as we have accurate phospho-site data for these proteins. For each location, we indicate the amino acids (S, T or Y) and the amino acid numbering. The S glycoprotein is shown as S1 and S2 to illustrate where the sites would be relative to the major cleavage site on the wild type S glycoprotein
Fig. 5
Modelling phosphorylation on the RNA binding domain of N protein. The positions of phosphorylation sites identified by mass spectrometry were mapped on the x-ray crystal structure of the N-terminal RNA binding domain of the N protein (aa residues 47–173) from SARS-CoV-2 (6YVO.pdb). The four monomer units in one asymmetric unit are distinctly coloured and shown as side (a, b) and top (c, d) views as ribbon (left hand figures) and space filling models (right hand figures)
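
The 24 nt in-frame deletion reported in the abstract and the replacement of residues 679NSPRRARSV687 with a single isoleucine described in the Fig. 3 caption can be checked against each other with simple arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline. The residue numbers, segment and deletion length come from the text above; the function names and the simplified minimal furin motif R-X-X-R are assumptions made for this example.

```python
import re

# Values taken from the abstract and the Fig. 3 caption.
DELETED_SEGMENT = "NSPRRARSV"        # residues 679-687 of the S glycoprotein
FIRST_RESIDUE, LAST_RESIDUE = 679, 687
REPLACEMENT = "I"                    # single isoleucine left at the junction
REPORTED_DELETION_NT = 24            # deletion length stated in the abstract

# Assumption: a simplified minimal furin recognition motif (R-X-X-R),
# used here only to illustrate that the motif is lost in the deletant.
MINIMAL_FURIN_MOTIF = re.compile(r"R..R")


def net_residues_lost(segment: str, replacement: str) -> int:
    """Number of amino acids lost when `segment` is replaced by `replacement`."""
    return len(segment) - len(replacement)


def is_in_frame(deleted_nt: int) -> bool:
    """A deletion keeps the reading frame if its length is a multiple of 3."""
    return deleted_nt % 3 == 0


def has_minimal_furin_motif(peptide: str) -> bool:
    """True if the peptide contains the simplified R-X-X-R motif."""
    return MINIMAL_FURIN_MOTIF.search(peptide) is not None


if __name__ == "__main__":
    lost = net_residues_lost(DELETED_SEGMENT, REPLACEMENT)
    print("residues in segment:", len(DELETED_SEGMENT))
    print("net residues lost:", lost)
    print("equivalent nucleotides:", lost * 3)
    print("in frame:", is_in_frame(REPORTED_DELETION_NT))
    print("motif in native segment:", has_minimal_furin_motif(DELETED_SEGMENT))
    print("motif in replacement:", has_minimal_furin_motif(REPLACEMENT))
```

Nine residues replaced by one gives a net loss of eight codons, which matches the 24 nt reported, and the RRAR stretch inside the segment is what the simplified motif picks up in the native sequence.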

References

    1. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020. doi: 10.1016/S1473-3099(20)30120-1.
    2. Kottier SA, Cavanagh D, Britton P. Experimental evidence of recombination in coronavirus infectious bronchitis virus. Virology. 1995;213:569–580. doi: 10.1006/viro.1995.0029.
    3. Wu F, et al. A new coronavirus associated with human respiratory disease in China. Nature. 2020;579:265–269. doi: 10.1038/s41586-020-2008-3.
    4. Zhou P, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270–273. doi: 10.1038/s41586-020-2012-7.
    5. Perlman S, Netland J. Coronaviruses post-SARS: update on replication and pathogenesis. Nat Rev Microbiol. 2009;7:439–450. doi: 10.1038/nrmicro2147.
