Genome Res. 2006 Sep;16(9):1182-90.
doi: 10.1101/gr.4565806. Epub 2006 Aug 10.

An initial map of insertion and deletion (INDEL) variation in the human genome

Ryan E Mills et al. Genome Res. 2006 Sep.

Abstract

Although many studies have been conducted to identify single nucleotide polymorphisms (SNPs) in humans, few studies have been conducted to identify alternative forms of natural genetic variation, such as insertion and deletion (INDEL) polymorphisms. In this report, we describe an initial map of human INDEL variation that contains 415,436 unique INDEL polymorphisms. These INDELs were identified with a computational approach using DNA re-sequencing traces that originally were generated for SNP discovery projects. They range from 1 bp to 9989 bp in length and are split almost equally between insertions and deletions, relative to the chimpanzee genome sequence. Five major classes of INDELs were identified, including (1) insertions and deletions of single-base pairs, (2) monomeric base pair expansions, (3) multi-base pair expansions of 2-15 bp repeat units, (4) transposon insertions, and (5) INDELs containing random DNA sequences. Our INDELs are distributed throughout the human genome with an average density of one INDEL per 7.2 kb of DNA. Variation hotspots were identified with up to 48-fold regional increases in INDEL and/or SNP variation compared with the chromosomal averages for the same chromosomes. Over 148,000 INDELs (35.7%) were identified within known genes, and 5542 of these INDELs were located in the promoters and exons of genes, where gene function would be expected to be influenced the greatest. All INDELs in this study have been deposited into dbSNP and have been integrated into maps of human genetic variation that are available to the research community.
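The density and gene-overlap figures in the abstract can be cross-checked with simple arithmetic. The sketch below is not part of the study; it only recomputes values implied by the rounded numbers quoted above (415,436 INDELs, one INDEL per 7.2 kb, 35.7% within known genes). The function names are hypothetical, and because the inputs are rounded, the results are approximate.

```python
# Figures quoted in the abstract (rounded as published).
TOTAL_INDELS = 415_436      # unique INDEL polymorphisms in the map
KB_PER_INDEL = 7.2          # average spacing: one INDEL per 7.2 kb
GENIC_FRACTION = 0.357      # share of INDELs within known genes


def implied_span_bp(n_indels: int, kb_per_indel: float) -> float:
    """Sequence length (bp) implied by an INDEL count and an average spacing."""
    return n_indels * kb_per_indel * 1_000


def implied_count(total: int, fraction: float) -> int:
    """Number of INDELs implied by a total and a rounded fraction."""
    return round(total * fraction)


span_bp = implied_span_bp(TOTAL_INDELS, KB_PER_INDEL)
genic_indels = implied_count(TOTAL_INDELS, GENIC_FRACTION)

print(f"Implied span: {span_bp / 1e9:.2f} Gb")
print(f"Implied genic INDELs: {genic_indels}")
```

The implied span comes out close to 3 Gb, roughly the size of the human genome, and the implied genic count is slightly above 148,000, which agrees with the "over 148,000" wording.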


Figures

Figure 1.
PCR validation study using the TSC panel. A PCR assay for INDEL detection is shown. (A) Arrows represent PCR primers. If a given segment of DNA is present (black box), a long PCR product (L) is produced. If it is absent, a short PCR product (S) is produced. (B) A PCR validation assay is shown for INDEL 647,421 located on chromosome 22 (also listed in Table 3 and Supplemental Table 2). The reference genome sequence indicated the presence of a 24-bp repeat of (AC)n at the coordinates given (Table 3). A trace from the TSC collection indicated that at least one individual from the Coriell panel lacked this DNA segment. Both alleles were identified in the Coriell panel of 24 individuals (Collins et al. 1999). Small (<10 bp) INDELs were located within predicted restriction sites and the PCR products were digested with an appropriate restriction enzyme to detect the INDEL (not shown).
Figure 2.
Genomic distribution of INDELs. The number of INDELs is plotted for each chromosome. The four major classes of INDELs shown in Table 2 are plotted (Single, single-base pair INDEL; Exp, single- and multi-base expansions; Other, other; Trans, transposons). Note that the INDELs generally are distributed throughout the genome according to the amount of DNA that is present on each chromosome. Chromosomes 4, 5, 8, 14, 18, 20, X, and Y are the exceptions. Chromosome 20 has more INDELs than average due to the inclusion of WCS traces from chromosome 20 in our experiments. The remaining exceptions had higher levels of trace coverage in the WGS and/or TSC trace sets.
Figure 3.
Fine-scale maps of chromosomal INDEL and SNP variation. INDEL and SNP variation graphs are shown for 200 kb bins across chromosomes 1, 3, and 18. The number of traces mapping to each bin and the number of trace bases in each bin also are charted. Chromosome 1 has several large variation hotspots in which the INDEL and/or SNP variation levels are elevated. Chromosome 3 has one large variation hotspot along with smaller hotspots. Chromosome 18 lacks large hotspots. The color key is indicated in the top panel. Similar graphs were generated for all human chromosomes and are found in Supplemental Tables chr1–chrY.
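The 200 kb binning described in this caption, together with the fold increases over the chromosomal average mentioned in the abstract, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 0-based coordinates, and the toy positions are assumptions made for illustration, and the sketch ignores the trace coverage that the figure charts alongside the variation counts.

```python
BIN_SIZE = 200_000  # 200 kb bins, as in the Figure 3 caption


def bin_counts(positions, chrom_length, bin_size=BIN_SIZE):
    """Count variants per fixed-size bin (positions assumed 0-based)."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        counts[pos // bin_size] += 1
    return counts


def fold_over_average(counts):
    """Express each bin's count as a multiple of the chromosome-wide mean."""
    mean = sum(counts) / len(counts)
    return [c / mean if mean else 0.0 for c in counts]


# Toy data: five hypothetical variant positions on a 1 Mb chromosome.
toy_positions = [10, 150_000, 199_999, 200_000, 650_000]
toy_counts = bin_counts(toy_positions, chrom_length=1_000_000)
toy_fold = fold_over_average(toy_counts)

print(toy_counts)
print(toy_fold)
```

In this toy example the first bin holds three of the five variants, so it sits at three times the chromosomal average. A real analysis would also need to account for uneven trace coverage before calling a bin a hotspot.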
