Biochem Biophys Res Commun. 2003 Jul 25;307(2):382-8.
doi: 10.1016/s0006-291x(03)01192-6.

ZCURVE_CoV: a new system to recognize protein coding genes in coronavirus genomes, and its applications in analyzing SARS-CoV genomes


Ling-Ling Chen et al. Biochem Biophys Res Commun.

Abstract

A new system to recognize protein coding genes in coronavirus genomes, especially suitable for SARS-CoV genomes, is proposed in this paper. Compared with some existing systems, the new program package has the merits of simplicity, high accuracy, reliability, and speed. The system, ZCURVE_CoV, has been run on each of the 11 newly sequenced SARS-CoV genomes. Consequently, six genomes not annotated previously have been annotated, and some problems with previous annotations of the remaining five genomes are pointed out and discussed. In addition to the polyprotein chain ORFs 1a and 1b and the four genes coding for the major structural proteins, spike (S), small envelope (E), membrane (M), and nucleocapsid (N), ZCURVE_CoV also predicts 5-6 putative proteins of unknown function, between 39 and 274 amino acids in length. Some single nucleotide mutations within these putative coding sequences have been detected, and their biological implications are discussed. A web service is provided through which a user can obtain the annotated result immediately by pasting SARS-CoV genome sequences into the input window on the web site (http://tubic.tju.edu.cn/sars/). The ZCURVE_CoV software can also be downloaded freely from the same web address and run on computers under Windows or Linux.


Figures

Fig. 1
Distribution of the mapping points corresponding to genes, non-genes, predicted genes, and questionable ORFs for the SARS-CoV TOR2 strain in a 3-dimensional (3-D) space. Each gene or ORF is mapped onto a point in a 9-D space. To visualize the distribution, the mapping points are projected onto the 3-D space spanned by the first three principal axes obtained by principal component analysis. The first, second, and third principal vectors are denoted by the X-, Y-, and Z-axes, respectively. The first three principal components account for 69.59% of the total inertia of the 9-D space. Green and orange balls represent the positive samples (genes) and negative samples (non-coding sequences), respectively. Blue balls correspond to the genes predicted by ZCURVE_CoV for the TOR2 strain, while red balls correspond to ORF 4, ORF 13, and ORF 14 annotated by Marra et al. The three red balls lie on the side of the non-coding sequences, indicating that ORF 4, ORF 13, and ORF 14 are very unlikely to code for proteins.
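The projection described in this caption can be illustrated with a minimal sketch. It assumes nothing about how ZCURVE_CoV builds its 9-D vectors: the input below is synthetic random data, and the function name project_to_principal_axes is hypothetical, not part of the published software. The sketch only shows how 9-D points are centered, projected onto the first three principal axes, and how the fraction of total inertia captured by those axes is computed.

```python
import numpy as np


def project_to_principal_axes(points, n_axes=3):
    """Project points onto their first n_axes principal axes.

    Returns the projected coordinates and the fraction of total
    inertia (variance) captured by those axes.
    """
    points = np.asarray(points, dtype=float)
    # PCA operates on mean-centered data.
    centered = points - points.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_axes].T
    # Squared singular values are proportional to the inertia along each axis.
    inertia = singular_values ** 2
    fraction = inertia[:n_axes].sum() / inertia.sum()
    return projected, fraction


# Synthetic stand-in for the 9-D mapping points (NOT data from the paper).
rng = np.random.default_rng(0)
samples = rng.normal(size=(40, 9)) * np.linspace(3.0, 0.5, 9)

coords, fraction = project_to_principal_axes(samples)
print(coords.shape)
print(f"fraction of total inertia in 3-D: {fraction:.4f}")
```

With real data, the three columns of coords would be plotted as the X-, Y-, and Z-axes, and the returned fraction plays the role of the 69.59% quoted in the caption.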
Fig. 2
Nucleotide mutations of the predicted gene Sars274 based on the alignment of the corresponding coding sequences in 11 complete genome sequences. A total of four point mutations are detected, of which one is a silent mutation and the other three cause amino acid changes in the putative protein. The point mutations occur at nucleotide positions 31, 302, 406, and 783, respectively. At the 31st position, G → A (TOR2) ⇒ Gly → Arg. Similarly, at the 302nd position, T (U) → A (HKU-39849) ⇒ Met → Lys; at the 406th position, A → C (BJ01) ⇒ Lys → Gln; and at the 783rd position, A → C (BJ01), with no amino acid change.
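The relationship between a nucleotide position and the resulting amino acid change can be checked with a small sketch using the standard genetic code. The Sars274 sequence itself is not given here, so the starting codons below are assumptions chosen to be consistent with the caption (only ATG for Met is fully determined, since Met has a single codon); the helper names locate and apply_mutation are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def locate(position):
    """Map a 1-based nucleotide position in a coding sequence to
    (1-based codon number, 0-based offset within that codon)."""
    return (position - 1) // 3 + 1, (position - 1) % 3


def apply_mutation(codon, offset, new_base):
    """Substitute one base in a codon; return (mutated codon, old aa, new aa)."""
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    return mutated, CODON_TABLE[codon], CODON_TABLE[mutated]


# (position, assumed original codon, new base). Codons other than ATG are
# illustrative guesses, not taken from the Sars274 sequence.
examples = [(31, "GGA", "A"), (302, "ATG", "A"), (406, "AAA", "C"), (783, "GGA", "C")]

for position, codon, new_base in examples:
    codon_number, offset = locate(position)
    mutated, old_aa, new_aa = apply_mutation(codon, offset, new_base)
    kind = "silent" if old_aa == new_aa else "amino acid change"
    print(position, codon_number, offset, codon, mutated, old_aa, new_aa, kind)
```

Positions 31 and 406 fall on the first base of their codons, position 302 on the second, and position 783 on the third, which is where a substitution is most often silent.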
