Cancers (Basel). 2023 Feb 4;15(4):1006.
doi: 10.3390/cancers15041006.

Geographic Variation and Risk Factor Association of Early Versus Late Onset Colorectal Cancer

Weichuan Dong et al. Cancers (Basel). 2023.

Abstract

The proportion of patients diagnosed with colorectal cancer (CRC) at age < 50 (early-onset CRC, or EOCRC) has steadily increased over the past three decades relative to the proportion diagnosed at age ≥ 50 (late-onset CRC, or LOCRC), despite the overall reduction in CRC incidence. An important gap in the literature is whether EOCRC shares the same community-level risk factors as LOCRC. Thus, we sought to (1) identify disparities in the incidence rates of EOCRC and LOCRC using geospatial analysis and (2) compare the importance of community-level risk factors (racial/ethnic, health status, behavioral, clinical care, physical environmental, and socioeconomic status risk factors) in the prediction of EOCRC and LOCRC incidence rates using a random forest machine learning approach. The incidence data came from the Surveillance, Epidemiology, and End Results (SEER) program (years 2000-2019). The geospatial analysis revealed large geographic variations in EOCRC and LOCRC incidence rates. For example, some regions had relatively low LOCRC and high EOCRC rates (e.g., Georgia and eastern Texas) while others had relatively high LOCRC and low EOCRC rates (e.g., Iowa and New Jersey). The random forest analysis revealed that the community-level risk factors most predictive of EOCRC incidence rates differed meaningfully from those most predictive of LOCRC incidence rates. For example, diabetes prevalence was the most important risk factor in predicting EOCRC incidence rate, but it was a less important predictor of LOCRC incidence rate; physical inactivity was the most important risk factor in predicting LOCRC incidence rate, but it was only the fourth most important predictor of EOCRC incidence rate. Thus, our community-level analysis demonstrates the geographic variation in EOCRC burden and the distinctive set of risk factors most predictive of EOCRC.

Keywords: colorectal cancer; early-onset; geographic information system; machine learning; random forest; regionalization; risk factor.

Conflict of interest statement

Weichuan Dong and Siran M. Koroukian reported receiving grants from the American Cancer Society (RWIA-20-111-02 RWIA) and contracts from the Cleveland Clinic Foundation, including a subcontract from Celgene Corporation. Siran M. Koroukian was also supported by grants from the Centers for Disease Control and Prevention (U48 DP005030-05S1 and U48 DP006404-03S7), the National Institutes of Health (R15 NR017792, UH3-DE025487, and R01 AG074946-01), and the American Cancer Society (132678-RSGI-19-213-01-CPHPS). Uriel Kim is supported by grants from the National Institute of General Medical Sciences (5T32GM007250), the National Center for Advancing Translational Sciences (5TL1TR002549), and the PhRMA Foundation (PDHO18). Johnie Rose reported receiving grants from NIH/National Cancer Institute during the conduct of the study; holding stock in Vinya Intelligence Inc outside the submitted work; and having a patent issued for US 270,799 B2 “In-home remote monitoring systems and methods for predicting health status decline”. No other disclosures were reported.

Figures

Figure 1
Geographic distribution of (A) EOCRC incidence rate, and (B) LOCRC incidence rate. Notes: Incidence rates are age-adjusted and classified by quartile. Data for states in white are not available in SEER.
Figure 2
Variable importance and direction of association of risk factors for (A) EOCRC incidence rate, and (B) LOCRC incidence rate. Notes: (1) The most important variable is set to 100%. The importance of the remaining variables is scaled relative to the most important variable. (2) Direction of association was determined by a linear correlation coefficient. (3) Categories of variable importance (i.e., high to low) were classified using the Jenks natural breaks method.
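The scaling and direction rules in notes (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the factor names, and every number below are hypothetical, and the raw scores merely stand in for whatever importance values a random forest would report. The Jenks classification in note (3) is not covered here.

```python
from math import sqrt


def scale_importance(raw):
    """Rescale raw importance scores so the largest equals 100%.

    `raw` maps a (hypothetical) risk-factor name to a positive raw score.
    """
    top = max(raw.values())
    if top <= 0:
        raise ValueError("at least one importance score must be positive")
    return {name: 100.0 * value / top for name, value in raw.items()}


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no linear association
    return cov / sqrt(var_x * var_y)


def direction(xs, ys):
    """Label the sign of the linear association between a factor and a rate."""
    r = pearson(xs, ys)
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Made-up raw scores for three placeholder factors (not study results).
raw_scores = {"factor_a": 0.5, "factor_b": 0.25, "factor_c": 0.125}
print(scale_importance(raw_scores))
# {'factor_a': 100.0, 'factor_b': 50.0, 'factor_c': 25.0}

# Made-up factor values and incidence rates for four placeholder regions.
factor_values = [1.0, 2.0, 3.0, 4.0]
incidence_rates = [2.0, 4.0, 5.0, 9.0]
print(direction(factor_values, incidence_rates))  # positive
```

Scaling to the top variable makes importance comparable within one model, but the 100% anchor is a different variable in each model, so percentages should be read as ranks within a panel rather than compared as absolute values across panels.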
Figure 3
Comparisons of the top ten risk factors based on variable importance between EOCRC and LOCRC models.
