Abstract
Comparisons of DNA polymorphism within species to divergence between species enable the discovery of molecular adaptation in evolutionarily constrained genes as well as the differentiation of weak from strong purifying selection1,2,3,4. The extent to which weak negative and positive darwinian selection have driven the molecular evolution of different species varies greatly5,6,7,8,9,10,11,12,13,14,15,16, with some species, such as Drosophila melanogaster, showing strong evidence of pervasive positive selection6,7,8,9, and others, such as the selfing weed Arabidopsis thaliana, showing an excess of deleterious variation within local populations9,10. Here we contrast patterns of coding sequence polymorphism identified by direct sequencing of 39 humans for over 11,000 genes to divergence between humans and chimpanzees, and find strong evidence that natural selection has shaped the recent molecular evolution of our species. Our analysis discovered 304 (9.0%) out of 3,377 potentially informative loci showing evidence of rapid amino acid evolution. Furthermore, 813 (13.5%) out of 6,033 potentially informative loci show a paucity of amino acid differences between humans and chimpanzees, indicating weak negative selection and/or balancing selection operating on mutations at these loci. We find that the distribution of negatively and positively selected genes varies greatly among biological processes and molecular functions, and that some classes, such as transcription factors, show an excess of rapidly evolving genes, whereas others, such as cytoskeletal proteins, show an excess of genes with extensive amino acid polymorphism within humans and yet little amino acid divergence between humans and chimpanzees.
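The contrast between polymorphism and divergence described above can be illustrated with the classic McDonald-Kreitman 2x2 table (ref. 3). The sketch below is not the Bayesian method used in this paper; it is a minimal frequentist version, assuming a two-sided Fisher's exact test and the neutrality index as summary statistics. The function name mk_test and the counts in the usage lines are illustrative assumptions and are not taken from the data set analysed here.

```python
from math import comb


def mk_test(dn, ds, pn, ps):
    """Classic McDonald-Kreitman test on one gene (illustrative sketch).

    dn, ds: fixed nonsynonymous / synonymous differences between species
    pn, ps: nonsynonymous / synonymous polymorphisms within a species
    Returns (neutrality_index, two_sided_fisher_p).
    """
    fixed = dn + ds      # row total: fixed differences
    nonsyn = dn + pn     # column total: nonsynonymous sites
    total = dn + ds + pn + ps

    def prob(x):
        # Hypergeometric probability that x of the fixed differences
        # are nonsynonymous, given the table margins.
        return comb(nonsyn, x) * comb(total - nonsyn, fixed - x) / comb(total, fixed)

    lo = max(0, fixed - (total - nonsyn))
    hi = min(fixed, nonsyn)
    p_obs = prob(dn)
    # Two-sided p-value: sum over tables no more probable than the observed one.
    p_value = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                  if p <= p_obs * (1 + 1e-9))

    # Neutrality index: NI < 1 suggests excess amino acid divergence,
    # NI > 1 suggests excess amino acid polymorphism.
    if dn and ds and ps:
        ni = (pn / ps) / (dn / ds)
    else:
        ni = float("nan")
    return ni, min(p_value, 1.0)


# Illustrative counts only (not from this study).
ni, p = mk_test(dn=7, ds=17, pn=2, ps=42)
print(f"NI = {ni:.3f}, Fisher p = {p:.4f}")
```

Under this sketch, a gene with a neutrality index below 1 and a small p-value would correspond to the "rapid amino acid evolution" pattern, and an index above 1 to the pattern of extensive amino acid polymorphism with little divergence. A per-gene test of this kind has little power when the counts are small, which is one reason to pool information across loci.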
References
Kimura, M. The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, Cambridge, 1983)
Hudson, R. R., Kreitman, M. & Aguadé, M. A test of neutral molecular evolution based on nucleotide data. Genetics 116, 153–159 (1987)
McDonald, J. H. & Kreitman, M. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351, 652–654 (1991)
Sawyer, S. A. & Hartl, D. L. Population genetics of polymorphism and divergence. Genetics 132, 1161–1176 (1992)
Eyre-Walker, A. & Keightley, P. D. High genomic deleterious mutation rates in hominids. Nature 397, 344–347 (1999)
Fay, J. C., Wyckoff, G. J. & Wu, C. I. Testing the neutral theory of molecular evolution with genomic data from Drosophila. Nature 415, 1024–1026 (2002)
Sawyer, S. A., Kulathinal, R. J., Bustamante, C. D. & Hartl, D. L. Bayesian analysis suggests that most amino acid replacements in Drosophila are driven by positive selection. J. Mol. Evol. 57 (suppl. 1), S154–S164 (2003)
Smith, N. G. & Eyre-Walker, A. Adaptive protein evolution in Drosophila. Nature 415, 1022–1024 (2002)
Bustamante, C. D. et al. The cost of inbreeding in Arabidopsis. Nature 416, 531–534 (2002)
Nordborg, M. et al. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol. 3, e196 (2005)
Halushka, M. K. et al. Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nature Genet. 22, 239–247 (1999)
Cargill, M. et al. Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nature Genet. 22, 231–238 (1999)
Stephens, J. C. et al. Haplotype variation and linkage disequilibrium in 313 human genes. Science 293, 489–493 (2001)
Livingston, R. J. et al. Pattern of sequence variation across 213 environmental response genes. Genome Res. 14, 1821–1831 (2004)
Sunyaev, S. et al. Prediction of deleterious human alleles. Hum. Mol. Genet. 10, 591–597 (2001)
Williamson, S. et al. Simultaneous inference of selection and population growth from patterns of variation in the human genome. Proc. Natl Acad. Sci. USA 102, 7882–7887 (2005)
Barrier, M., Bustamante, C. D., Yu, J. & Purugganan, M. D. Selection on rapidly evolving proteins in the Arabidopsis genome. Genetics 163, 723–733 (2003)
Nielsen, R. et al. A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol. 3, e170 (2005)
Manunta, P. et al. Alpha-adducin polymorphisms and renal sodium handling in essential hypertensive patients. Kidney Int. 53, 1471–1478 (1998)
Morrison, A. C., Bray, M. S., Folsom, A. R. & Boerwinkle, E. ADD1 460W allele associated with cardiovascular disease in hypertensive individuals. Hypertension 39, 1053–1057 (2002)
Kent, W. J. BLAT–the BLAST-like alignment tool. Genome Res. 12, 656–664 (2002)
Weinreich, D. M. & Rand, D. M. Contrasting patterns of nonneutral evolution in proteins encoded in nuclear and mitochondrial genomes. Genetics 156, 385–399 (2000)
Williamson, S., Fledel-Alon, A. & Bustamante, C. D. Population genetics of polymorphism and divergence for diploid selection models with arbitrary dominance. Genetics 168, 463–475 (2004)
Ioerger, T. R., Clark, A. G. & Kao, T. H. Polymorphism at the self-incompatibility locus in Solanaceae predates speciation. Proc. Natl Acad. Sci. USA 87, 9732–9735 (1990)
Acknowledgements
We thank K. Thornton and B. Payseur for suggestions during the analysis. Some of the analysis was supported by NIH grants to C.D.B., R.N. and A.G.C. We also acknowledge the help of J. Pillardy and the Cornell University Theory Center Computational Biology Service Unit.
Author Contributions
S.G., D.M.T., D.C., T.J.W., J.J.S., M.D.A. and M.C. conceived, designed and performed the experiments. C.D.B., A.F.-A., A.G.C., S.W., R.N. and M.J.H. analysed the data.
Ethics declarations
Competing interests
Accession numbers for the SNP markers analysed in this study are dbSNP numbers ss48401226–ss48429818 and ss48429821–ss48431291, submitted under the handle APPLERA_GI. Reprints and permissions information is available at npg.nature.com/reprintsandpermissions. The authors declare no competing financial interests.
Supplementary information
Supplementary Table 1
A spreadsheet file containing the Mann-Whitney and Z-test results for all PANTHER classifications of molecular function and biological process. (XLS 139 kb)
Supplementary Data 2
A text file with one line per gene giving the cell entries in the McDonald-Kreitman tables, estimated selection intensities and confidence intervals, as well as posterior P-values. (TXT 1086 kb)
Supplementary Methods
A detailed description of how the single nucleotide polymorphisms analysed in this paper were discovered and validated. Also includes details on bioinformatic controls and quality checks. (DOC 131 kb)
Supplementary Data 1
Provides a detailed mathematical description of the statistical method we employ in this paper as well as details of coalescent simulations used to gauge robustness to demographic misspecification. (PDF 798 kb)
Supplementary Figure 1
Relationship between scaled McDonald–Kreitman cell entries and posterior mean of the selection coefficient γ for all genes in the INS data set. (PDF 1614 kb)
Supplementary Figure 2
Scatterplot of log-odds posterior of negative selection. (PDF 45 kb)
Supplementary Figure Legends
Text to accompany the above Supplementary Figures. (DOC 40 kb)
Cite this article
Bustamante, C., Fledel-Alon, A., Williamson, S. et al. Natural selection on protein-coding genes in the human genome. Nature 437, 1153–1157 (2005). https://doi.org/10.1038/nature04240
DOI: https://doi.org/10.1038/nature04240