Database bias and the identification of protein coding sequences
- PMID: 3677996
- DOI: 10.1089/dna.1987.6.493
Abstract
A simple quantitative test for the probability that an open reading frame actually codes for a protein has been described by Tramontano and Macchiato (1986). However, their test is only valid for the special case in which both coding and noncoding sequences are represented equally. We present a generalized adaptation of their method that uses estimates for the relative proportions of coding and noncoding sequences to provide a more accurate prediction.
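The abstract does not give the formula, so the following is only an illustrative sketch of the general idea, assuming the correction takes the form of a standard Bayes' rule update. The function name `posterior_coding` and its parameters are hypothetical and are not taken from the paper: `lik_coding` and `lik_noncoding` stand for the likelihood of an observed sequence statistic under the coding and noncoding models, and `prior_coding` is the estimated proportion of coding sequences. Setting `prior_coding=0.5` corresponds to the equal-representation special case mentioned above.

```python
def posterior_coding(lik_coding, lik_noncoding, prior_coding=0.5):
    """Posterior probability that an open reading frame is coding.

    Generic Bayes' rule, not the paper's exact formula (an assumption):
        P(coding | data) = L_c * p / (L_c * p + L_n * (1 - p))
    where p is the assumed prior proportion of coding sequences.
    """
    if not 0.0 <= prior_coding <= 1.0:
        raise ValueError("prior_coding must lie in [0, 1]")
    weighted_coding = lik_coding * prior_coding
    weighted_noncoding = lik_noncoding * (1.0 - prior_coding)
    total = weighted_coding + weighted_noncoding
    if total == 0:
        raise ValueError("both weighted likelihoods are zero")
    return weighted_coding / total


# Same evidence, two different assumed proportions of coding sequence.
equal_prior = posterior_coding(0.8, 0.2, prior_coding=0.5)
skewed_prior = posterior_coding(0.8, 0.2, prior_coding=0.1)
print(f"equal representation: {equal_prior:.3f}")
print(f"10% coding prior:     {skewed_prior:.3f}")
```

With made-up likelihoods of 0.8 and 0.2, the same evidence yields a much lower posterior when coding sequences are assumed to be rare, which is the kind of database bias the abstract refers to.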
Cited by
- Assessment of protein coding measures. Nucleic Acids Res. 1992 Dec 25;20(24):6441-50. doi: 10.1093/nar/20.24.6441. PMID: 1480466. Free PMC article. Review.