Science. 1985 May 17;228(4701):815-22.
doi: 10.1126/science.2988123.

The LDL receptor gene: a mosaic of exons shared with different proteins

T C Südhof et al. Science. 1985.

Abstract

The multifunctional nature of coated pit receptors predicts that these proteins will contain multiple domains. To establish the genetic basis for these domains, we have isolated and characterized the gene for the human low density lipoprotein (LDL) receptor. This gene is more than 45 kilobases in length and contains 18 exons, most of which correlate with functional domains previously defined at the protein level. Thirteen of the 18 exons encode protein sequences that are homologous to sequences in other proteins: five of these exons encode a sequence similar to one in the C9 component of complement; three exons encode a sequence similar to a repeat sequence in the precursor for epidermal growth factor (EGF) and in three proteins of the blood clotting system (factor IX, factor X, and protein C); and five other exons encode nonrepeated sequences that are shared only with the EGF precursor. The LDL receptor appears to be a mosaic protein built up of exons shared with different proteins, and it therefore belongs to several supergene families.

Figures

Fig. 1
Map of the human LDL receptor gene. The gene is shown in the 5′ to 3′ orientation at the top of the diagram and is drawn to scale. Exons are denoted by filled-in areas, and introns by open areas. The regions encompassed by genomic DNA inserts in the seven bacteriophage λ and two cosmid clones are indicated at the bottom. Cleavage sites for 13 selected restriction endonucleases are shown. Asterisks denote sites that are present in the cDNA. The encircled Pvu II site is polymorphic in human populations (30). The diagonal line between exons 1 and 2 represents a gap of unknown size not present in any of the genomic clones. Additional cleavage sites for the restriction enzymes shown may be present in this gap and in intron 6 (Table 1, legend). The λ clones were isolated from 1.2 × 10⁷ plaques of a human genomic bacteriophage λ library (31). Cos1 was isolated from 6 × 10⁶ colonies of a human cosmid library (32). Cos26 was isolated from 0.9 × 10⁶ colonies of a human cosmid library (33). The libraries were screened with 32P-labeled probes derived from the human LDL receptor cDNA, pLDLR-2 (4). Probes were isotopically labeled by nick translation (34) or hexanucleotide priming (35) and screening was carried out with standard procedures (34). Positive clones were plaque-purified or isolated as single colonies. Thirty fragments from the nine genomic clones were subcloned into pBR322 and characterized by restriction endonuclease digestion, Southern blotting (34), and DNA sequencing of exon-intron junctions (see Table 1). The restriction map was verified by comparing overlapping and independently isolated genomic clones and by Southern blotting analysis of genomic DNA isolated from normal individuals.
Fig. 2
Nucleotide sequence of the 5′ end of the human LDL receptor gene. Nucleotide position +1 is assigned to the A of the ATG codon specifying the initiator methionine; negative numbers refer to 5′ flanking sequences. Amino acids encoded by the first exon and the position of intron 1 are indicated on the bottom line. Vertical arrows above the sequence indicate sites of transcription initiation as determined by S1 nuclease mapping (Fig. 3). Vertical arrows below the sequence indicate sites of transcription initiation as determined by primer extension (Fig. 4). Asterisks denote an apparent S1 nuclease hypersensitive site (above sequence) or a strong stop point for reverse transcriptase (below sequence). Two AT-rich regions that are located 20 to 30 nucleotides upstream of the mRNA start points are boxed. Solid horizontal arrows denote three imperfect direct repeats of 16 nucleotides each. Dashed arrows denote two imperfect inverted repeats of 14 nucleotides each. The DNA sequence was determined by a combination of the chemical (36) and enzymatic (37) methods. Two M13 subclones derived from the bacteriophage genomic clone λ34 were used as templates together with the universal primer (38) and five LDL receptor-specific oligonucleotide primers to establish the sequence by the enzymatic method. Selected regions were then sequenced again by the chemical method; 90 percent of the sequence was determined on both strands of the DNA.
Fig. 3
Sites of transcription initiation in the human LDL receptor gene as determined by S1 nuclease analysis. The indicated amount of yeast tRNA (lanes 1 and 2), poly(A)+ RNA from SV40-transformed human fibroblasts (lanes 3 to 6), or poly(A)+ RNA from adult adrenal glands (lane 7) was annealed to a single-stranded DNA probe, labeled with 32P at the 5′ end, corresponding to nucleotides −682 to +43 of Fig. 2. The fibroblasts were grown in the presence or absence of sterols as indicated. The RNA-DNA hybrids were digested with S1 nuclease, and the resistant products were subjected to electrophoresis through a 10 percent polyacrylamide-7M urea gel and detected by autoradiography for 72 hours at −20°C. SV40-transformed fibroblasts were set up in roller bottles (3 × 10⁶ cells per bottle) and grown under standard conditions with fetal calf serum (10 percent) for 48 hours (39). On day 2, one-half of the roller bottles were switched to medium containing 10 percent calf lipoprotein-deficient serum and 10 µM compactin in the absence of sterols. The other half of the roller bottles were switched to medium containing 10 percent newborn calf serum in the presence of 25-hydroxycholesterol (3 µg/ml) plus cholesterol (12 µg/ml). The cells were incubated for 24 hours and harvested for the preparation of poly(A)+ RNA (4). Adult adrenal glands (obtained from human cadavers at the time of removal of kidneys for transplantation) were frozen at −70°C until preparation of poly(A)+ RNA. The 5′ end-labeled, single-stranded 32P probe of 725 nucleotides was prepared by priming an M13 clone containing a fragment of the LDL receptor gene corresponding to nucleotides −686 to +66 (Fig. 2) with a 32P-labeled synthetic oligonucleotide complementary to nucleotides +19 to +43 (Fig. 2). The synthetic oligonucleotide was labeled at the 5′ end with [γ-32P]ATP (7000 Ci/mmol) and polynucleotide kinase (34) to a specific radioactivity of ~5 × 10⁶ cpm/pmol.
After primer extension with the Klenow fragment of DNA polymerase I in the presence of each of the four deoxynucleoside triphosphates (15 µM), the resulting double-stranded DNA was cleaved with Bam HI, and the radioactive probe fragment was purified by electrophoresis on a denaturing polyacrylamide gel and subsequent electroelution (34). A portion of the probe (10⁵ cpm) was coprecipitated with the indicated amount of tRNA or poly(A)+ RNA in ethanol at −70°C. The precipitated material was resuspended in 20 µl of 80 percent formamide, 0.4M NaCl, 40 mM 1,4-piperazinediethane-sulfonic acid (pH 6.4), and 1 mM EDTA and hybridized at 65°C for 36 hours. The samples were diluted with 9 volumes of 0.25M NaCl, 30 mM potassium acetate (pH 4.5), 1 mM ZnSO4, and 5 percent glycerol; treated with 200 units of S1 nuclease (Bethesda Research Laboratories) at room temperature for 60 minutes (40); precipitated with ethanol; and analyzed on a sequencing gel. The protected fragments were compared with the adjacent dideoxy nucleotide–derived sequencing ladder obtained with the same primer and M13 template used to generate the probe. Numbers on the left denote the estimated nucleotide position corresponding to the 5′ end of the protected fragments according to the numbering scheme of Fig. 2. The sequence in the −78 to −97 region is shown on the right.
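The probe and primer lengths quoted in these legends follow from the numbering scheme of Fig. 2, in which +1 is the A of the ATG and negative numbers count into the 5′ flanking sequence. The sketch below is illustrative only and is not part of the original paper. It assumes that the scheme has no position 0, which is consistent with the stated 725-nucleotide probe spanning −682 to +43. The function names `to_offset` and `span_length` are hypothetical.

```python
def to_offset(pos: int) -> int:
    """Map a gene coordinate onto a contiguous integer axis.

    Assumption: the numbering runs ..., -2, -1, +1, +2, ... with no position 0.
    """
    if pos == 0:
        raise ValueError("this numbering scheme has no position 0")
    return pos if pos < 0 else pos - 1


def span_length(start: int, end: int) -> int:
    """Number of nucleotides from start to end, both inclusive."""
    if start > end:
        raise ValueError("start must not lie 3' of end")
    return to_offset(end) - to_offset(start) + 1


# S1 nuclease probe (legend of Fig. 3): nucleotides -682 to +43
print(span_length(-682, 43))  # 725

# Primer extension primer (legend of Fig. 4): nucleotides +9 to +101
print(span_length(9, 101))  # 93
```

Both results agree with the lengths given in the legends (725 and 93 nucleotides), which supports the no-zero reading of the coordinates.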
Fig. 4
Sites of transcription initiation in the human LDL receptor gene as determined by primer extension analysis. Poly(A)+ RNA was obtained from human A-431 epidermoid carcinoma cells (lanes 1 and 2) or SV40-transformed human fibroblasts (lanes 3 and 4) that had been grown in the absence or presence of sterols. The RNA was annealed to a uniformly labeled, single-stranded 32P probe [corresponding to nucleotides +9 to +101, Fig. 2 and (4)] that served as a primer for extension by reverse transcriptase. The primer-extended products were subjected to electrophoresis through a 10 percent polyacrylamide-7M urea gel and detected after autoradiography for 60 hours at −20°C. The poly(A)+ RNA from A-431 cells and SV40-transformed fibroblasts cultured in the absence or presence of sterols was prepared as described in the legend of Fig. 3. A single-stranded, uniformly 32P-labeled primer complementary to nucleotides +9 to +101 of the human LDL receptor mRNA was derived from an M13 cDNA clone (38) as described (41). A portion (approximately 2 × 10⁶ cpm) of the 93-nucleotide 32P-primer was precipitated with ethanol together with 10 µg of the indicated poly(A)+ RNA; resuspended in 5 µl of a buffer containing 50 mM tris-chloride (pH 8.0), 50 mM KCl, 5 mM MgCl2, and 20 mM dithiothreitol; and sealed in a glass capillary. The reaction mixture was denatured by boiling for 2 minutes, and primer-template complexes were allowed to form at 65°C for 4 hours. After this annealing period, the entire solution was transferred to a plastic microfuge tube containing 2.5 µl of a 1 mM solution of the four deoxynucleoside triphosphates and 6 units of avian myeloblastosis virus reverse transcriptase (Molecular Genetic Resources). Primer extension was allowed to occur for 50 minutes at 37°C and was stopped by the addition of 6 µl of a formamide-dye mix. The sample was boiled for 5 minutes, quickly chilled on ice, and subjected to electrophoresis on a sequencing gel.
Size standards, shown on the right, were generated by dideoxy nucleotide sequencing (37) of a known M13 recombinant clone. Numbers on the left denote the calculated position corresponding to the limits of primer extension according to the numbering scheme of Fig. 2. The intense band at the bottom of lanes 1 to 4 represents the 32P-labeled primer used in the experiment.
Fig. 5
Exon organization and protein domains in the human LDL receptor. The six domains of the protein are delimited by thick black lines and are labeled in the lower portion. The seven cysteine-rich, 40-amino acid repeats in the LDL binding domain (Fig. 6) are assigned the roman numerals I to VII. Repeats IV and V are separated by eight amino acids. The three cysteine-rich repeats in the EGF precursor homology domain (Fig. 8) are lettered A to C. The positions at which introns interrupt the coding region are indicated by arrowheads. Exon numbers are shown between the arrowheads.
Fig. 6
Location of introns in the cysteine-rich repeat region of the binding domain of the LDL receptor. The amino acids constituting each of the seven repeat units are numbered in the left column according to the translated sequence of the receptor cDNA (4). (A) Optimal alignment was made by the computer programs ALIGN and RELATE (4) with modifications based on the location of introns. Amino acids that are present at a given position in more than 50 percent of the repeats are boxed and shown as a consensus on the bottom line. Cysteine residues (C) in the consensus sequence are underlined. The positions at which introns interrupt the coding sequence of the gene are denoted by the encircled amino acids. The single letter amino acid code translates to the three letter code as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; Y, Tyr. (B) The net charge of each of the amino acids in (A) is shown. All of the conserved amino acids that are charged bear a negative charge; none are positively charged.
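The consensus rule described in this legend (a residue is reported when it occurs at a given position in more than 50 percent of the repeats) and the per-residue charge shown in panel (B) can be sketched as follows. This is a minimal illustration, not the ALIGN or RELATE programs used by the authors. The toy sequences are hypothetical and are not the receptor repeats, the function names are invented for this example, and histidine is treated as uncharged, which is an assumption.

```python
from collections import Counter

# Assumed charges at neutral pH; histidine is treated as uncharged here.
CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}


def consensus(aligned: list[str], threshold: float = 0.5, gap: str = "-") -> str:
    """Return the residue found in more than `threshold` of the aligned
    sequences at each column, or `gap` where no residue passes the cutoff."""
    if not aligned:
        return ""
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    n = len(aligned)
    out = []
    for column in zip(*aligned):
        counts = Counter(res for res in column if res != gap)
        if counts:
            residue, count = counts.most_common(1)[0]
            out.append(residue if count / n > threshold else gap)
        else:
            out.append(gap)
    return "".join(out)


def net_charges(seq: str) -> list[int]:
    """Charge of each residue: -1 for D/E, +1 for K/R, 0 otherwise."""
    return [CHARGE.get(res, 0) for res in seq]


# Hypothetical toy alignment, one-letter amino acid code.
toy_repeats = ["CDGSDE", "CDGKDE", "CSGSDE", "CDGRWE"]
cons = consensus(toy_repeats)
print(cons)               # CDG-DE
print(net_charges(cons))  # [0, -1, 0, 0, -1, -1]
```

In the toy alignment the fourth column has no residue above the 50 percent cutoff (S occurs in exactly half of the sequences), so it is left blank in the consensus, and every charged consensus residue is negative, mirroring the pattern the legend reports for the receptor repeats.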
Fig. 7
Comparison of consensus sequence in the binding domain of the LDL receptor (Fig. 6A) with the homologous sequence from complement factor C9 (5).
Fig. 8
Amino acid alignment of segments A, B, and C from the LDL receptor with homologous regions from the EGF precursor and several proteins of the blood clotting system. The number of amino acids comprising the variable region in the middle of each sequence is shown in parentheses. The standard one-letter amino acid abbreviations are used (Fig. 6). Amino acids that are present at a given position in more than 50 percent of the sequences are boxed and shown as a consensus on the bottom line. Cysteine residues (C) are underlined in the consensus sequence. Sequence data for the LDL receptor were taken from (4); sequence data for the other proteins were taken from the original references cited in (13, 23).

References

    1. Goldstein JL, Anderson RGW, Brown MS. Nature (London) 1979;279:679.
    1. Brown MS, Anderson RGW, Goldstein JL. Cell. 1983;32:663.
    1. Russell DW, et al. ibid. 1984;37:577.
    1. Yamamoto T, et al. ibid. 1984;39:27.
    1. Stanley KK, et al. EMBO J. 1985;4:375.
    2. DiScipio RG, et al. Proc. Natl. Acad. Sci. U.S.A. 1984;81:7298.