Nat Commun. 2014 Sep 30;5:5114.
doi: 10.1038/ncomms6114.

Functional annotation of colon cancer risk SNPs

Lijing Yao et al. Nat Commun.

Abstract

Colorectal cancer (CRC) is a leading cause of cancer-related deaths in the United States. Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with increased risk for CRC. A molecular understanding of the functional consequences of this genetic variation has been complicated because each GWAS SNP is a surrogate for hundreds of other SNPs, most of which are located in non-coding regions. Here we use genomic and epigenomic information to test the hypothesis that the GWAS SNPs and/or correlated SNPs are in elements that regulate gene expression, and identify 23 promoters and 28 enhancers. Using gene expression data from normal and tumour cells, we identify 66 putative target genes of the risk-associated enhancers (10 of which were also identified by promoter SNPs). Employing CRISPR nucleases, we delete one risk-associated enhancer and identify genes showing altered expression. We suggest that similar studies be performed to characterize all CRC risk-associated enhancers.

Figures

Figure 1. Identification of potential functional SNPs for CRC.
(a) Shown is the number of SNPs identified by FunciSNP in each of three categories for 25 colon cancer risk loci (see Table 1 for information on each CRC risk SNP). For exons, only non-synonymous SNPs are reported; parentheses indicate the number of SNPs that are predicted to be damaging; see Table 2 for a list of the expressed genes associated with the correlated SNPs. For TSS regions, the region from −2 kb to +2 kb relative to the start site of all transcripts annotated in GENCODE V15, including coding genes and non-coding RNAs, was used; see Table 2 for a list of expressed transcripts associated with the correlated SNPs. (b) For H3K27Ac analyses, ChIP-seq data from normal sigmoid colon and HCT116 tumour cells were used; see Table 3 for further analysis of distal regions harbouring SNPs in normal and tumour colon cells. The SNPs having an r² > 0.1 that overlapped with H3K27Ac sites were identified separately for the HCT116 and sigmoid colon data sets. Because more than one SNP could identify the same H3K27Ac-marked region, the SNPs were then collapsed into distinct H3K27Ac peaks. The sites that were within ±2 kb of a promoter region were removed to limit the analysis to distal elements. To obtain a more stringent set of enhancers, those regions having only SNPs with r² < 0.5 were removed. The remaining set of 68 distal H3K27Ac sites was contained within 19 of the 25 risk loci. Visual inspection to identify only the robust enhancers having linked SNPs not at the margins reduced the set to 27 enhancers located in 9 of the 25 risk loci; an additional enhancer was identified in SW480 cells (see Table 3 for the genomic locations of all 28 enhancers). Colour key: green=SNPs or H3K27Ac sites unique to normal colon, red=unique to colon tumour cells, blue=present in both normal and tumour colon.
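The automated filtering steps in panel (b) can be sketched in a few lines of Python. This is an illustrative sketch only, not the FunciSNP implementation: the `Snp` and `Region` types, the `candidate_enhancers` function, and the choice to treat "within ±2 kb of a promoter" as any overlap with a 2 kb window around a TSS are all assumptions made for the example. The final visual-inspection step is manual and is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    chrom: str
    pos: int
    r2: float  # linkage disequilibrium (r squared) with the GWAS tag SNP


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int
    end: int

    def contains(self, snp):
        return self.chrom == snp.chrom and self.start <= snp.pos <= self.end


def near_tss(peak, tss_list, flank=2000):
    # Assumption: a peak is promoter-proximal if it overlaps
    # the window [tss - flank, tss + flank] of any TSS.
    return any(
        chrom == peak.chrom
        and peak.start <= tss + flank
        and peak.end >= tss - flank
        for chrom, tss in tss_list
    )


def candidate_enhancers(snps, peaks, tss_list, min_r2=0.1, strict_r2=0.5):
    """Return {peak: overlapping SNPs} for distal peaks passing the filters."""
    kept = {}
    for peak in peaks:
        # Collapse SNPs into peaks: keep SNPs with r2 > min_r2 inside the peak.
        hits = [s for s in snps if s.r2 > min_r2 and peak.contains(s)]
        if not hits:
            continue
        # Restrict the analysis to distal elements.
        if near_tss(peak, tss_list):
            continue
        # Stringency filter: drop peaks whose SNPs all have r2 < strict_r2.
        if all(s.r2 < strict_r2 for s in hits):
            continue
        kept[peak] = hits
    return kept
```

Keying the result by peak rather than by SNP mirrors the caption's point that several SNPs can mark the same H3K27Ac region.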
Figure 2. Expression of risk-associated genes in colon cells.
The left panel indicates whether a transcript was identified by a SNP located in an exon or a TSS, or is near a risk-associated enhancer; the middle panel shows the expression values of each of the 41 transcripts in sigmoid colon or HCT116 tumour cells; the right panel shows the fold change of each transcript in the tumour cells (positive indicates higher expression in the tumour).
Figure 3. Linking a transcript to an enhancer using TCGA data.
(a) Shown is the location of enhancer 19 and the position of the three SNPs (in red) identified in the eQTL studies and two other SNPs (in blue) identified by the FunciSNP analysis but not present on the SNP array, in relation to the H3K27Ac, RNA-seq and TCF7L2 ChIP-seq data for that region. Also shown are the ENCODE ChIP-seq transcription factor tracks from the University of California, Santa Cruz genome browser. (b) The expression of the TMED6 RNA is shown for samples having homozygous or heterozygous alleles for three SNPs in enhancer 19. The upper and lower quartiles of the box plots are the 75th and 25th percentiles, respectively. The whisker top and bottom are the 90th and 10th percentiles, respectively. The horizontal line through the box is the median value. The P-value corresponds to the regression coefficient based on the residue expression level and the germline genotype. Sample size is listed under each genotype. (c) A schematic of the gene structure in the genomic region around enhancer 19 (yellow box) is shown; the arrows indicate the direction of transcription of each gene. The three genes in the enhancer 19 region that showed differential expression in normal versus tumour colon samples (Table 5) are indicated; of these, only TMED6 was identified in the eQTL analysis.
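The box plots in panel (b) place the whiskers at the 10th and 90th percentiles rather than at the more common 1.5 × interquartile-range limits. A minimal sketch of that five-number summary follows; the function name `box_summary` and the inclusive interpolation method are assumptions for illustration, and the paper does not state which percentile method was used.

```python
from statistics import quantiles


def box_summary(values):
    """Summarise expression values as drawn in the Figure 3b box plots.

    Returns the 10th, 25th, 50th, 75th and 90th percentiles.
    Assumes at least two values and inclusive interpolation.
    """
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(values, n=20, method="inclusive")
    return {
        "whisker_low": cuts[1],    # 10th percentile
        "box_low": cuts[4],        # 25th percentile
        "median": cuts[9],         # 50th percentile
        "box_high": cuts[14],      # 75th percentile
        "whisker_high": cuts[17],  # 90th percentile
    }
```

Computing one such summary per genotype group reproduces the layout of the plot; the regression P-value reported in the caption is a separate calculation and is not covered here.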
Figure 4. Identification of genes affected by deletion of enhancer 7.
(a) Shown are the expression differences (x axis) and the significance of the change (y axis) of the genes in the control HCT116 versus HCT116 cells having complete deletion of enhancer 7. The Illumina Custom Differential Expression Algorithm was used to determine P-values to identify the significantly altered genes; three replicates each for the control and deleted cells were used. Genes on chromosome 8 (the location of enhancer 7) are shown in blue. The spot representing the MYC gene is indicated by the arrow. (b) Shown are all genes on chromosome 8 that change in expression and the 10 genes showing the largest changes in expression upon deletion of enhancer 7. The location of the enhancer is indicated and the chromosome number is shown on the outside of the circle. (c) The genes identified as potential targets using TCGA expression data are indicated; of these, MYC is the only one showing a change in gene expression upon deletion of the enhancer.
Figure 5. Summary of identified candidate genes correlated with increased risk for CRC.
Shown are the 80 candidate genes identified in this study. Gene names in green were identified only as potential enhancer targets; the other genes were identified as direct targets by either an exon SNP or a TSS SNP. The putative enhancer target genes were selected as described in the text. For each tag SNP, the relative number of SNPs that identified an exon (red portion), a TSS (blue portion), or an enhancer (green portion) is shown by the bar graph. The nine genomic regions that harbour CRC risk enhancers are shown by the green rectangles outside the circle.
