This script downloads cross-referenced databases, parses them, and creates a single, fully annotated TSV file to replace gene_xref (i.e. for ANNOVAR). It can be used to keep these databases up to date and can be added to the Achabilarity container (custom_database.txt).
To run it, clone this repository with git and execute:
sh updater.sh
Requirements:
- Python (tested with 3.6)
- pandas library
Databases used:
- HGNC approved gene names
- gnomAD constraint scores (observed/expected ratio, oe, for LoF, missense and synonymous variants, with confidence intervals)
- UniProt database (gene function, tissue specificity, involvement in disease)
- OMIM (phenotype columns of genemap2)
NB: Access to the OMIM database must be requested. Once you have access, replace the genemap2.txt link in updater.sh with your own.
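The merge step can be illustrated with a minimal pandas sketch. It is not the code used by updater.sh: the toy tables, the column names (gene, approved_name, oe_lof, phenotypes), the helper merge_annotations and the "." placeholder for missing values are all assumptions made for illustration. The idea is to take one table as the base, left-join the others on the gene symbol, and write the result as tab-separated text.

```python
import pandas as pd

# Toy stand-ins for the parsed tables (hypothetical columns and values).
hgnc = pd.DataFrame({
    "gene": ["BRCA1", "TP53", "CFTR"],
    "approved_name": [
        "BRCA1 DNA repair associated",
        "tumor protein p53",
        "CF transmembrane conductance regulator",
    ],
})
gnomad = pd.DataFrame({"gene": ["BRCA1", "TP53"], "oe_lof": [0.7, 0.2]})
omim = pd.DataFrame({
    "gene": ["CFTR", "TP53"],
    "phenotypes": ["Cystic fibrosis", "Li-Fraumeni syndrome"],
})


def merge_annotations(base, *others, key="gene"):
    """Left-join every table in `others` onto `base` using the `key` column."""
    merged = base
    for other in others:
        # how="left" keeps every gene of the base table, annotated or not.
        merged = merged.merge(other, on=key, how="left")
    return merged


annotated = merge_annotations(hgnc, gnomad, omim)

# With no path given, to_csv returns the TSV as a string;
# na_rep fills missing annotations with the assumed "." placeholder.
tsv_text = annotated.to_csv(sep="\t", index=False, na_rep=".")
print(tsv_text)
```

A left join on the HGNC table keeps one row per gene even when a source has no entry for it, which keeps the output usable as a gene-level cross-reference file.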
A Jupyter Notebook explaining how this script works is available in this repository: AddDB_updater.ipynb
Montpellier Bioinformatique pour le Diagnostique Clinique (MoBiDiC)
CHU de Montpellier
France