This script downloads cross-referenced databases, parses them and creates a single, fully annotated TSV file to replace gene_xref (i.e. for ANNOVAR). It can be used to keep these databases up to date, and the resulting file can be added to the Achabilarity container (custom_database.txt).
To run it, clone this repository with git and run:
sh updater.sh https://data.omim.org/downloads/my-Registration-Code/genemap2.txt
Requirements:
- Python (tested with 3.6)
- pandas library
Databases downloaded and parsed:
- HGNC approved gene name
- gnomAD constraint scores (observed/expected ratio (oe) for LoF, missense and synonymous variants, with confidence intervals)
- UniProt database (gene function, tissue specificity, involvement in disease)
- OMIM (phenotype columns of genemap2)
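The way the sources listed above end up in one annotated table can be sketched with pandas. This is a minimal illustration, not the code used by updater.sh: the helper `merge_sources`, the column names (`gene`, `name`, `oe_lof`, `phenotypes`) and all values are hypothetical placeholders, and writing "." for missing annotations is an assumption made here for readability.

```python
import io
from functools import reduce

import pandas as pd


def merge_sources(frames, key="gene"):
    """Outer-join per-source tables on a shared gene-symbol column (hypothetical helper)."""
    return reduce(lambda left, right: left.merge(right, on=key, how="outer"), frames)


# Placeholder tables standing in for parsed HGNC, gnomAD and OMIM data.
hgnc = pd.DataFrame(
    {"gene": ["GENE1", "GENE2"], "name": ["placeholder name 1", "placeholder name 2"]}
)
gnomad = pd.DataFrame({"gene": ["GENE1"], "oe_lof": [0.5]})
omim = pd.DataFrame({"gene": ["GENE2"], "phenotypes": ["placeholder phenotype"]})

# An outer join keeps every gene, even when a source has no entry for it.
merged = merge_sources([hgnc, gnomad, omim]).sort_values("gene").reset_index(drop=True)

# Serialize as tab-separated text; missing annotations are written as "."
# (an assumption of this sketch, not a documented behaviour of the script).
buffer = io.StringIO()
merged.to_csv(buffer, sep="\t", index=False, na_rep=".")
tsv_text = buffer.getvalue()
```

An outer join is used so that a gene absent from one source still appears in the output with its other annotations.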
NB: For the OMIM database, you need to request access (https://www.omim.org/downloads) and pass the link to genemap2.txt as the argument to updater.sh.
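Once genemap2.txt has been downloaded, reading it could look like the sketch below. It is not taken from the script. It assumes that the file is tab-separated, that comment lines start with "#", and that the column names sit on the last "#" line before the data. The function `read_genemap2`, the column names "Approved Gene Symbol" and "Phenotypes", and the sample content are illustrative assumptions, so check them against your own copy of the file.

```python
import io


def read_genemap2(handle):
    """Parse genemap2-style text into a list of dicts (hypothetical helper).

    Assumption: lines starting with '#' are comments, and the last such
    line before the first data row holds the tab-separated column names.
    """
    header = None
    rows = []
    for line in handle:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        if line.startswith("#"):
            # Comments after the data has started (e.g. a footer) are ignored.
            if not rows:
                header = line.lstrip("#").strip().split("\t")
            continue
        rows.append(line.split("\t"))
    if header is None:
        raise ValueError("no header line found")
    return [dict(zip(header, row)) for row in rows]


# Placeholder content imitating the assumed layout; not real OMIM data.
sample = (
    "# Placeholder licence comment\n"
    "# Chromosome\tApproved Gene Symbol\tPhenotypes\n"
    "chr1\tGENE1\tplaceholder phenotype\n"
    "# Placeholder footer comment\n"
)

records = read_genemap2(io.StringIO(sample))
phenotypes_by_gene = {r["Approved Gene Symbol"]: r["Phenotypes"] for r in records}
```

To read a real file, pass an open file handle instead of the `io.StringIO` sample.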
A Jupyter Notebook explaining how this script works is available in this repository: AddDB_updater.ipynb
Montpellier Bioinformatique pour le Diagnostic Clinique (MoBiDiC)
CHU de Montpellier
France