Comparative Study
Retrovirology. 2011 Nov 8;8:90.
doi: 10.1186/1742-4690-8-90.

Identification, characterization, and comparative genomic distribution of the HERV-K (HML-2) group of human endogenous retroviruses

Ravi P Subramanian et al. Retrovirology. 2011.

Abstract

Background: Integration of retroviral DNA into a germ cell may lead to a provirus that is transmitted vertically to that host's offspring as an endogenous retrovirus (ERV). In humans, ERVs (HERVs) comprise about 8% of the genome, the vast majority of which are truncated and/or highly mutated and no longer encode functional genes. The most recently active retroviruses that integrated into the human germ line are members of the Betaretrovirus-like HERV-K (HML-2) group, many of which contain intact open reading frames (ORFs) in some or all genes, sometimes encoding functional proteins that are expressed in various tissues. Interestingly, this expression is upregulated in many tumors, ranging from those of breast and ovarian tissues to lymphomas and melanomas, as well as in schizophrenia, rheumatoid arthritis, and other disorders.

Results: No study to date has characterized all HML-2 elements in the genome, an essential step towards determining a possible functional role of HML-2 expression in disease. We present here the most comprehensive and accurate catalog of all full-length and partial HML-2 proviruses, as well as solo LTR elements, within the published human genome to date. Furthermore, we provide evidence for preferential maintenance of proviruses and solo LTR elements on gene-rich chromosomes of the human genome and in proximity to gene regions.

Conclusions: Our analysis has found and corrected several errors in the annotation of HML-2 elements in the human genome, including mislabeling of a newly identified group called HML-11. HML-2 elements have been implicated in a wide array of diseases, and characterization of these elements will play a fundamental role in understanding the relationship between endogenous retrovirus expression and disease.


Figures

Figure 1
Cartoon schematic of HML-2 proviruses in the human genome. (A) A cartoon depicting the layout of the prototypical HML-2 retrovirus, including gag, pro, pol, and env gene positions. Splice sites of env, np9, and rec genes are also shown, with a faint gray band indicating the type 1 deletion region. Proviruses of the LTR5Hs (B), LTR5A (C), and LTR5B (D) groups are depicted and color-coded according to type. Type 1 proviruses are colored in grey (with LTRs filled grey), type 2 proviruses are colored in black (with LTRs filled black), and unclassified proviruses, which have open LTRs, are colored grey. Insertions and deletions < 3 bases are depicted with blue and red flags, respectively. Larger insertions of retroelements are labeled according to the type of element inserted, and large deletions are shown with dashed lines corresponding to the missing sequence. Stop codons are indicated with a grey flag. Daggers indicate human-specific proviruses, with double daggers indicating polymorphic proviruses.
Figure 2
Phylogeny of provirus LTR sequences. Bayesian inference trees were generated using 5' and 3' LTRs of HML-2 provirus elements in the human genome. LTR sequences of less than 250 bases in length were not included, as they limited capacity to detect phylogenetic relationships among LTR sequences. Sequences are color-coded according to distinctive LTR subgroup features (see Methods). LTR5Hs sequences are shown in (A), with 5B and 5A sequences added to serve as a reference. LTR5A (B) and LTR5B (C) are similarly displayed. Open diamonds indicate recombinant proviruses, and duplications are grouped using colored bars. Posterior probability values > 70 are shown for the best tree rooted on 5A and 5B (A), 5Hs (B), and 5B and 5Hs (C).
Figure 3
Phylogeny of HML endogenous elements. A Bayesian inference tree of the pol gene from prototypical members of HML 1-10 families (indicated by "REF") along with exogenous betaretroviruses MMTV, MPMV, and JSRV was generated to characterize proviruses identified through our BLAT search (black). Sequences were colored according to LTR5 subgroup and annotated with filled diamonds for type 1 proviruses and open diamonds for proviruses of undetermined type. Colored sequences without diamonds represent type 2 proviruses. Posterior probability values > 70 are shown for the tree rooted on the exogenous betaretrovirus sequences.
Figure 4
Phylogenetic analysis of gag and env genes. Bayesian inference trees were generated for the first ~1800 bp of gag (A), as well as the SU portion of env (B). Trees were rooted using the 17p13.1 provirus sequence as an out-group, with posterior probabilities above 80 shown. Sequences are color-coded according to LTR group, with type 1 proviruses indicated with filled diamonds, and undetermined types with open diamonds. All other proviruses shown are type 2.
Figure 5
Age of HML-2 LTRs. LTRs from 5Hs, 5A, and 5B subgroups were used to determine distance measurements as a function of time of integration. Ages of "Solo LTR" elements were determined as described in Methods, while "Provirus" element ages were determined by 5' to 3' distance measurement. "Human Specific 5Hs" refers to solo LTRs that are only found in the human genome, and whose age was calculated using the solo LTR age calculation method.
Figure 6
Relationships between chromosome size, solo LTR, provirus or RefSeq gene frequency. Binomial regression analysis was performed using chromosome size (A), solo LTR frequency (B), or RefSeq gene frequency (C) as a predictor of provirus frequency; or predicting solo LTR frequency using chromosome size (D), provirus frequency (E), or RefSeq gene frequency (F); or predicting RefSeq gene frequency using chromosome size (G), solo LTR frequency (H), or provirus frequency (I). The bold line represents mean correlation, with 95% confidence intervals shown with dashed lines. P-values are shown for each plot.
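
The Figure 5 caption mentions dating proviruses by "5' to 3' distance measurement". The sketch below illustrates the general idea behind that kind of estimate and is not the authors' procedure: the two LTRs of a provirus are identical at integration and then diverge independently, so the elapsed time is roughly the pairwise divergence divided by twice the substitution rate. The function names, the simple mismatch-fraction distance, and the rate passed in are all assumptions made for illustration; the page does not state which distance model or rate was used.

```python
def ltr_divergence(ltr5: str, ltr3: str) -> float:
    """Fraction of differing sites between two pre-aligned LTR sequences.

    Assumption: the inputs are already aligned to equal length, with '-'
    marking gaps. Columns containing a gap in either sequence are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a == "-" or b == "-":
            continue  # ignore gap columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def provirus_age_years(divergence: float, rate_per_site_per_year: float) -> float:
    """Estimate time since integration from 5'/3' LTR divergence.

    Both LTRs accumulate substitutions independently, so the expected
    divergence is 2 * rate * time; hence time = divergence / (2 * rate).
    The rate is a caller-supplied assumption, not a value from the paper.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return divergence / (2.0 * rate_per_site_per_year)
```

For example, `provirus_age_years(ltr_divergence(five_prime, three_prime), rate)` gives an age in years for one provirus, where `rate` is whatever neutral substitution rate the reader chooses to assume. A raw mismatch fraction ignores multiple substitutions at the same site, so a corrected distance model would be preferable for older elements.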

