Codebase for inferring single nucleotide activation and repression maps using deep-learning

Shamir-Lab/EnhancerSilencerDL

How to use

Input: a data file and a target directory for the results.

    The data file contains training, validation and test sets.
    Each set of N samples is represented by:
       An N*1000*7 feature matrix: for each of the 1000 positions, 4 one-hot-encoded DNA sequence features, followed by nucleotide-resolution signals for DNA methylation, H3K27ac and H3K4me1.
       An N*1000 target matrix: a per-position value that is a positive activation signal for enhancers, a negative repression signal for silencers, and 0 otherwise.
       An N*3 target class matrix.
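To make the layout above concrete, here is a minimal sketch of how one sample's 1000*7 feature rows could be assembled, and how a per-position target value can be read. It is an illustration, not the repository's own preprocessing: the function names, the A, C, G, T channel order, and the all-zero encoding of unknown bases are assumptions.

```python
SEQ_LEN = 1000
NUCLEOTIDES = "ACGT"  # assumed channel order; the README does not state it


def encode_sample(sequence, methylation, h3k27ac, h3k4me1):
    """Return 1000 rows of 7 features: 4 one-hot DNA channels + 3 signal tracks."""
    tracks = (methylation, h3k27ac, h3k4me1)
    if len(sequence) != SEQ_LEN or any(len(t) != SEQ_LEN for t in tracks):
        raise ValueError(f"sequence and all tracks must have length {SEQ_LEN}")
    rows = []
    for i, base in enumerate(sequence.upper()):
        # Bases outside A/C/G/T (e.g. N) become all zeros: an assumption.
        one_hot = [1.0 if base == n else 0.0 for n in NUCLEOTIDES]
        rows.append(one_hot + [float(t[i]) for t in tracks])
    return rows


def describe_target(value):
    """Interpret one entry of the N*1000 target matrix by its sign."""
    if value > 0:
        return "activation"  # enhancer signal
    if value < 0:
        return "repression"  # silencer signal
    return "none"
```

Stacking the result of `encode_sample` for N samples gives the N*1000*7 shape described above.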

Download the data and decompress it into the 'data' folder:

Training, validation and test data: data.hdf5.gz

Exploratory (left-out) data: data.left_out.hdf5.gz
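The decompression step can be done with `gunzip`, or with Python's standard library as in the sketch below. The helper name `decompress_gz` is illustrative, and the sketch assumes the two archives have already been downloaded to the current directory.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(archive, target_dir="data"):
    """Decompress e.g. data.hdf5.gz into target_dir/data.hdf5 and return the path."""
    archive = Path(archive)
    if archive.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {archive.name}")
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Path.stem drops only the final suffix: data.hdf5.gz -> data.hdf5
    target = target_dir / archive.stem
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams in chunks, so large files are fine
    return target
```

For example, `decompress_gz("data.hdf5.gz")` and `decompress_gz("data.left_out.hdf5.gz")` would produce the two files that the commands below read from `./data/`.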

Training a regression model:

    python train.deeptact.reg.py  ./data/data.hdf5 ./output/

Output, written to the directory ./output/:

    mae.txt
    model_weights.reg.hdf5

Training a classification model:

    python train.deeptact.py  ./data/data.hdf5 ./output/

Output, written to the directory ./output/:

    auc.txt
    model_weights.class.hdf5
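The two training commands above can also be scripted. The sketch below runs each one and then checks that the output files listed in this README were written. The script names, arguments and file names come from the README; the helper functions and the check itself are illustrative, and creating the output directory beforehand is a precaution rather than a documented requirement.

```python
import subprocess
import sys
from pathlib import Path

# Training script -> output files the README says it writes.
TRAINING_STEPS = {
    "train.deeptact.reg.py": ["mae.txt", "model_weights.reg.hdf5"],
    "train.deeptact.py": ["auc.txt", "model_weights.class.hdf5"],
}


def build_command(script, data_file, out_dir):
    """Build the argument list for one training run."""
    return [sys.executable, script, str(data_file), str(out_dir)]


def missing_outputs(out_dir, expected):
    """Return the expected file names that are not present in out_dir."""
    return [name for name in expected if not (Path(out_dir) / name).exists()]


def train_all(data_file="./data/data.hdf5", out_dir="./output/"):
    """Run both training scripts and fail loudly if an output is missing."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for script, expected in TRAINING_STEPS.items():
        subprocess.run(build_command(script, data_file, out_dir), check=True)
        missing = missing_outputs(out_dir, expected)
        if missing:
            raise FileNotFoundError(f"{script} did not produce: {missing}")
```

Calling `train_all()` from the repository root would train the regression model first and the classification model second.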

Predicting on the left-out dataset:

    python predict.py ./data/data.left_out.hdf5 ./output/

Output, written to the directory ./output/:

    predicted_classes.pickle
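The prediction file can be opened with Python's `pickle` module. The README does not document what object is stored inside, so the sketch below only loads it and reports its type and size; the helper names are illustrative. If the file holds NumPy arrays, NumPy must be installed to unpickle it, and, as with any pickle, only files from a trusted source should be loaded.

```python
import pickle
from pathlib import Path


def load_predictions(out_dir="./output/"):
    """Load predicted_classes.pickle from the output directory."""
    with open(Path(out_dir) / "predicted_classes.pickle", "rb") as handle:
        return pickle.load(handle)


def summarize(obj):
    """Return (type name, shape or length) without assuming the stored type."""
    shape = getattr(obj, "shape", None)  # present on array-like objects
    if shape is not None:
        return type(obj).__name__, shape
    if hasattr(obj, "__len__"):
        return type(obj).__name__, len(obj)
    return type(obj).__name__, None
```

A typical first look would be `print(summarize(load_predictions()))`.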
