Comparative Study
Lancet. 2003 May 24;361(9371):1779-85.
doi: 10.1016/s0140-6736(03)13414-9.

Comparative full-length genome sequence analysis of 14 SARS coronavirus isolates and common mutations associated with putative origins of infection

Yi Jun Ruan et al. Lancet.

Erratum in

  • Lancet. 2003 May 24;361(9371):1832

Abstract

Background: The cause of severe acute respiratory syndrome (SARS) has been identified as a new coronavirus. Whole genome sequence analysis of various isolates might provide an indication of potential strain differences of this new virus. Moreover, mutation analysis will help to develop effective vaccines.

Methods: We sequenced the entire SARS viral genome of cultured isolates from the index case (SIN2500) presenting in Singapore, from three primary contacts (SIN2774, SIN2748, and SIN2677), and one secondary contact (SIN2679). These sequences were compared with the isolates from Canada (TOR2), Hong Kong (CUHK-W1 and HKU39849), Hanoi (URBANI), Guangzhou (GZ01), and Beijing (BJ01, BJ02, BJ03, BJ04).

Findings: We identified 129 sequence variations among the 14 isolates, with 16 recurrent variant sequences. Common variant sequences at four loci define two distinct genotypes of the SARS virus. One genotype was linked with infections originating in Hotel M in Hong Kong, the second contained isolates from Hong Kong, Guangzhou, and Beijing with no association with Hotel M (p<0.0001). Moreover, other common sequence variants further distinguished the geographical origins of the isolates, especially between Singapore and Beijing.
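The idea of a recurrent variant, one seen in two or more isolates, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `recurrent_variants`, the toy sequences, and the choice of the majority base in each column as the reference are all assumptions made for illustration (the paper numbers positions against the URBANI sequence). The input is assumed to be already aligned.

```python
from collections import Counter

# Symbols treated as "no call": '-' (missing) and 'N' (ambiguous).
NO_CALL = {"-", "N"}


def recurrent_variants(aligned, min_isolates=2):
    """Return {position: {variant_base: [isolate names]}} for variant bases
    carried by at least `min_isolates` isolates.

    `aligned` maps isolate name -> sequence; all sequences must already be
    aligned to the same length. Positions are reported 1-based.
    Assumption: the most common called base in a column is the reference.
    """
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("sequences must be aligned to the same length")

    result = {}
    for pos in range(length):
        column = {name: aligned[name][pos] for name in names}
        called = [base for base in column.values() if base not in NO_CALL]
        if not called:
            continue
        reference = Counter(called).most_common(1)[0][0]

        # Group isolates by the non-reference base they carry at this position.
        carriers = {}
        for name, base in column.items():
            if base not in NO_CALL and base != reference:
                carriers.setdefault(base, []).append(name)

        # Keep only variants shared by enough isolates.
        shared = {b: ns for b, ns in carriers.items() if len(ns) >= min_isolates}
        if shared:
            result[pos + 1] = shared
    return result


# Hypothetical toy alignment, not real SARS-CoV data.
toy = {
    "iso_A": "ACGTACGT",
    "iso_B": "ACGTACGT",
    "iso_C": "ACGTACGT",
    "iso_D": "ATGTACGA",
    "iso_E": "ATGTNCGT",
}
print(recurrent_variants(toy))
```

In the toy alignment, position 2 carries a variant shared by two isolates and is reported, whereas the change at position 8 occurs in a single isolate and is filtered out at the default threshold.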

Interpretation: Despite the recent onset of the SARS epidemic, genetic signatures are emerging that partition the worldwide SARS viral isolates into groups on the basis of contact source history and geography. These signatures can be used to trace sources of infection. In addition, a common variant associated with a non-conservative aminoacid change in the S1 region of the spike protein suggests that immunological pressures might be starting to influence the evolution of the SARS virus in human populations.

Figures

Figure 1
Genome structure of SARS-CoV. NSP=non-structural proteins. S=spike protein. E=small envelope protein. N=nucleocapsid. M=membrane protein. MHV=murine hepatitis virus. MD=metal binding. 3CL-PRO=3C-like proteinase. The top scale shows the approximate nucleotide position along the SARS-CoV genome (golden bar) determined from the Singapore isolate SIN2500. The arrows on top of the genome bar map the locations where nucleotide sequence variations, present in two or more isolates, were detected. The two open triangles point to the locations where the two multi-nucleotide deletions occurred. The SARS genome is predicted to encode 23 putative mature proteins (blue bars). The arrows on top of the protein bars indicate the location of aminoacid changes.
Figure 2
Homology of SARS-CoV genome sequence (SIN2500) to other coronaviruses. A heat map created by comparing overlapping fragments of the SARS-CoV genome sequence against a database of coronavirus sequences. The SARS fragments are plotted along the horizontal axis in the order they appear in the genome, and the other coronaviruses are plotted vertically. The brightness of a pixel corresponds to the strength of the match between a SARS fragment and a coronavirus genome; the smaller the p value, the lighter the pixel.
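The two mechanical steps in this legend, cutting the genome into overlapping fragments and turning a p value into pixel brightness, can be sketched in Python as follows. The legend does not give the fragment length, the overlap, the matching method, or the exact brightness scale, so the function names, the `size` and `step` parameters, and the log scale with a `floor` cut-off are all hypothetical choices; the step that scores each fragment against the coronavirus database is deliberately left out.

```python
import math


def overlapping_fragments(seq, size, step):
    """Yield (start, fragment) windows of `size` bases, advancing by `step`.

    Choosing step < size makes consecutive fragments overlap. A trailing
    window shorter than `size` is dropped. Starts are 0-based.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def brightness(p_value, floor=1e-50):
    """Map a p value to a brightness in [0, 1]: the smaller p, the lighter.

    Assumption: a log10 scale clipped at `floor`; the original figure's
    scale is not stated in the legend.
    """
    p = min(max(p_value, floor), 1.0)
    return math.log10(p) / math.log10(floor)


# Hypothetical toy sequence, not SARS-CoV data.
for start, fragment in overlapping_fragments("ACGTACGTAC", size=4, step=2):
    print(start, fragment)
```

Each fragment would become one column of the heat map and each reference coronavirus one row, with `brightness` applied to the p value of the corresponding match.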
Figure 3
Sequence comparisons of the 14 available SARS-CoV genomes. Only those variant sequences (red) that were present in at least two independent sequences are shown. See webappendix2 for complete list of all variant nucleotides. The frequency of appearance of each variant nucleotide is presented. Highlighted in yellow are four sequence positions that define two distinct genotypic variants of SARS-CoV. The position of each nucleotide is based upon the URBANI SARS-CoV sequence and the corresponding encoded protein or uncharacterised open reading frames (ORF) are indicated. The effect, if any, on the encoded aminoacid (AA) is described. Nucleotides that are missing (-) or ambiguous (N) in the genome sequences are indicated and the length of each genome sequence is given. The patients’ countries of origin and the association of each patient with Hotel M are provided for all viral genome sequences, and the relative order of transmission is provided for the Singapore cases.
Figure 4
Molecular relations between the 14 SARS-CoV isolates. Phylogenetic trees obtained by applying PAUP* to complete genome sequences of all 14 SARS-CoV samples. The tree was built with only sequence variants that occurred at least twice.
Figure 5
Clinical relations between the 14 SARS-CoV isolates. The routes of transmission are shown for the 14 viral isolates that have been sequenced, indicated with solid boxes. Solid arrows=routes of transmission from Hotel M that are known to be direct. Broken arrow=direct relation information is not available. Patient A is the Hong Kong index patient who travelled from Guangdong to Hong Kong and transmitted SARS to others at Hotel M, who then travelled to Singapore, Canada, and Vietnam, thus becoming index cases in those countries. The routes of transmission from Guangdong are unknown and shown as dotted arrows. *Dashed boxes are routes of transmission that are uncertain.
