2024 Jun 15;3(4):e214.
doi: 10.1002/imt2.214. eCollection 2024 Aug.

Deep learning enhancing guide RNA design for CRISPR/Cas12a-based diagnostics

Baicheng Huang et al. Imeta.

Abstract

Rapid and accurate diagnostic tests are fundamental for improving patient outcomes and combating infectious diseases. The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) Cas12a-based detection system has emerged as a promising solution for on-site nucleic acid testing. Nonetheless, the effective design of CRISPR RNA (crRNA) for Cas12a-based detection remains challenging and time-consuming. In this study, we propose an enhanced crRNA design system with deep learning for Cas12a-mediated diagnostics, referred to as EasyDesign. This system employs an optimized convolutional neural network (CNN) prediction model, trained on a comprehensive data set comprising 11,496 experimentally validated Cas12a-based detection cases, encompassing a wide spectrum of prevalent pathogens, achieving Spearman's ρ = 0.812. We further assessed the model performance in crRNA design for four pathogens not included in the training data: Monkeypox Virus, Enterovirus 71, Coxsackievirus A16, and Listeria monocytogenes. The results demonstrated superior prediction performance compared with traditional experimental screening. Furthermore, we have developed an interactive web server (https://crispr.zhejianglab.com/) that integrates EasyDesign with recombinase polymerase amplification (RPA) primer design, enhancing user accessibility. Through this web-based platform, we successfully designed optimal Cas12a crRNAs for six human papillomavirus (HPV) subtypes. Remarkably, all of the top five predicted crRNAs for each HPV subtype exhibited robust fluorescent signals in CRISPR assays, thereby suggesting that the platform could effectively facilitate clinical sample testing. In conclusion, EasyDesign offers a rapid and reliable solution for crRNA design in Cas12a-based detection, which could serve as a valuable tool for clinical diagnostics and research applications.

Keywords: CRISPR; Cas12a; convolutional neural network; crRNA design; deep learning; diagnostic.
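
The abstract reports model quality as Spearman's ρ, which is the Pearson correlation computed on the ranks of the true and predicted activities. The following is a minimal, illustrative sketch of that metric in plain Python; the function names and the toy inputs are assumptions made for this example and are not taken from EasyDesign.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(true_activity, predicted_activity):
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    return pearson(average_ranks(true_activity), average_ranks(predicted_activity))
```

Because only ranks matter, any prediction that orders the guides correctly scores ρ = 1 even if its scale differs from the measured fluorescence, which is why a rank-based metric suits guide selection.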

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Preparation and evaluation of the Cas12a‐based crRNA data set for deep learning model training. (A) The process of developing deep learning‐based crRNA design models, including high‐throughput data collection, machine learning, and validation. (B) Flowchart illustrating the process of data acquisition using the CRISPR fluorescence‐based assay. (C) Distribution of the number of mismatches in the data set for guide‐to‐target pairs, including 0, 1, and 2 mismatches. (D) Characterization of the distribution of pathogens in the training data set, including viruses and bacteria. (E)–(H) Distribution of base types denoted as “N1N2” within the “TTTN1‐N2” region of the protospacer adjacent motif (PAM) and its adjacent extending position. (I)–(M) Types of mutations in the guide‐to‐target pairs, including A‐N, T‐N, C‐N, G‐N, and deletion (D)‐N. (N) Distribution of the GC content at different positions within the crRNA data set. crRNA, CRISPR RNA; CRISPR, Clustered Regularly Interspaced Short Palindromic Repeats.
Figure 2
Development and evaluation of a deep learning model suitable for Cas12a diagnostic design. (A) A four‐step flowchart outlining the process of selecting and training deep learning models. (B) Performance comparison of models using Spearman's and Pearson's correlation coefficients. (C) The CNN12a model uses one‐hot encoding for all enumerated crRNA‐target DNA sequence pairs with a “TTTN” PAM. (D) Kernel density estimation of true (x‐axis) and predicted (y‐axis) values of CNN12ae. (E) Density map of true activity by quartile of predicted value; the x‐axis is the normalized true activity and the y‐axis is the interquartile range of predictions. The distribution of true activity is more concentrated when the CNN12a predicted value is below the second quartile. (F) Receiver operating characteristic (ROC) curve for the hold‐out test set of CNN12ae, distinguishing pairs as either inactive or active (true activity = 3), with an AUC (area under the curve) of 0.8247 (p < 0.0001). CNN, convolutional neural network; crRNA, CRISPR RNA; FPR, false positive rate; PAM, protospacer adjacent motif.
Figure 3
Validation of EasyDesign performance across four pathogens. (A) Flowchart illustrating the experimental validation of EasyDesign through a comparative screening test on pathogen templates. (B) Comparison between the experimentally detected number of top crRNAs and the number of top crRNAs predicted by EasyDesign. (C)–(F) Comparative analysis of crRNA activity in experimental CRISPR fluorescence versus EasyDesign predictions, including Monkeypox Virus (MPXV), Enterovirus 71 (EV71), Coxsackievirus A16 (CV‐A16), and Listeria monocytogenes (L. monocytogenes). CRISPR, Clustered Regularly Interspaced Short Palindromic Repeats; crRNA, CRISPR RNA.
Figure 4
HPV clinical sample testing design via EasyDesign web server. (A) Flowchart illustrating the web‐based EasyDesign platform, including sequence input, crRNA, and amplification primer design for Cas12a‐based detection. (B) Presentation of the web‐based design interface, including sequence uploading, parameter selection, and the generation of candidate crRNA and amplification primer pairs. (C)–(H) Fluorescence detection kinetic curves representing six human papillomavirus (HPV) subtypes (HPV6, HPV11, HPV16, HPV18, HPV31, and HPV33) using synthetic DNA templates. The optimal amplification primers and crRNAs were generated utilizing the EasyDesign web‐based tool. Five optimal crRNAs and their corresponding primer pairs were generated for each template. The fluorescence graph illustrates the results obtained after a 30‐min incubation period. (I)–(N) Fluorescence detection results were obtained for clinical samples representing the six HPV subtypes after a 30‐min incubation period. The positive clinical samples were identified as follows: #S1 to #S3 for HPV6, #S4 to #S6 for HPV11, #S7 to #S10 for HPV16, #S11 and #S12 for HPV18, #S13 and #S14 for HPV31, and #S15 for HPV33. crRNA, CRISPR RNA; NC, negative control.
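
The Figure 2 caption mentions one‐hot encoding of enumerated crRNA‐target pairs with a “TTTN” PAM, and the Figure 1 caption mentions GC content. The sketch below illustrates those three ideas on a single DNA strand. It is not the EasyDesign implementation: the function names, the 20‐nt protospacer length, the A/C/G/T channel order, and the choice to scan only the given strand are all assumptions made for illustration.

```python
BASES = "ACGT"  # assumed channel order for the one-hot vectors

def find_candidates(target, spacer_len=20):
    """List (pam_start, pam, protospacer) for each TTTN PAM on this strand.

    Assumes the protospacer lies immediately 3' of the PAM and that
    spacer_len is 20 nt; both are illustrative choices.
    """
    candidates = []
    for i in range(len(target) - 4 - spacer_len + 1):
        pam = target[i:i + 4]
        if pam[:3] == "TTT" and pam[3] in BASES:
            protospacer = target[i + 4:i + 4 + spacer_len]
            candidates.append((i, pam, protospacer))
    return candidates

def one_hot(seq):
    """Encode a DNA string as a list of 4-element 0/1 vectors (A, C, G, T)."""
    return [[1 if base == b else 0 for b in BASES] for base in seq]

def gc_content(seq):
    """Fraction of G or C bases in seq."""
    return sum(base in "GC" for base in seq) / len(seq)

if __name__ == "__main__":
    # Toy target: two flanking bases, one TTTA PAM, a 20-nt protospacer.
    toy = "GG" + "TTTA" + "ACGTACGTACGTACGTACGT" + "CC"
    for start, pam, spacer in find_candidates(toy):
        print(start, pam, spacer, gc_content(spacer), len(one_hot(spacer)))
```

A real pipeline would also scan the reverse complement and encode guide‐target mismatches; those steps are omitted here to keep the sketch short.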
