Convolutional Neural Network Architectures for
Predicting DNA-Protein Binding
Zeng H., Edwards M.D., Gifford D.K. (2016) "Convolutional Neural Network Architectures for
Predicting DNA-Protein Binding".
Proceedings of Intelligent Systems for Molecular Biology (ISMB) 2016
Bioinformatics, 32(12):i121-i127. doi: 10.1093/bioinformatics/btw255.
Abstract: We present a systematic exploration of convolutional neural network architectures
for predicting DNA sequence binding using a large compendium of transcription
factor datasets. We identify the best-performing architectures by varying convolutional
neural network width, depth, and pooling designs. We find that adding
convolutional kernels to a network is important for motif discovery and the use
of local max-pooling is important for differentiating bound versus unbound sequences
when both sequences contain a factor’s cognate motif. We explore the
sufficiency of training data in the performance of these learning approaches, and
have created a flexible cloud-based framework that permits the rapid exploration
of alternative neural network architectures for problems in computational biology.
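As a rough illustration of the pooling designs mentioned in the abstract, the sketch below scans a one-hot encoded DNA sequence with a single convolutional kernel and compares global max-pooling with local max-pooling. It is a toy example in plain Python, not the authors' Caffe or Keras code: the kernel (an exact match to the made-up motif "TGA"), the example sequences, the pooling window of 5, and all function names are assumptions chosen for illustration.

```python
# Illustrative sketch only; names, kernel, and sequences are hypothetical.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows ordered A, C, G, T."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d(encoded, kernel):
    """Slide one kernel (width x 4 weights) along the sequence, then apply ReLU."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        total = sum(
            kernel[i][j] * encoded[start + i][j]
            for i in range(width)
            for j in range(4)
        )
        scores.append(max(0.0, total))
    return scores


def global_max_pool(scores):
    """Keep only the single best match anywhere in the sequence."""
    return max(scores)


def local_max_pool(scores, window):
    """Keep the best match within each non-overlapping window."""
    return [max(scores[i:i + window]) for i in range(0, len(scores), window)]


# Assumed toy kernel: its weights are the one-hot encoding of the motif "TGA",
# so a perfect match scores 3.0.
kernel = one_hot("TGA")

two_sites = conv1d(one_hot("ACTGACGTTGAC"), kernel)  # motif occurs twice
one_site = conv1d(one_hot("ACTGACGTTCCC"), kernel)   # motif occurs once

print(global_max_pool(two_sites), global_max_pool(one_site))      # 3.0 3.0
print(local_max_pool(two_sites, 5), local_max_pool(one_site, 5))  # [3.0, 3.0] [3.0, 1.0]
```

In this toy case both sequences contain the motif, so global max-pooling gives them the same score, while local max-pooling keeps per-region scores that differ between them. This is meant only to show what kind of information each pooling design retains; it does not reproduce the paper's experiments.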
Source code and documentation
Genomics-tailored deep learning platform to efficiently perform hyper-parameter tuning, training and testing: Caffe-based, Keras-based
Amazon Elastic Compute Cloud (EC2) launcher that efficiently deploys deep learning models (and any software encapsulated in Docker) on the cloud: GitHub
Docker version of DeepBind that is runnable on any GPU machine: GitHub
Docker version of DeepSEA that is runnable on any GPU machine: Training new model, Making predictions
Other supplementary data for the paper
Caffe model specification for models compared: files.
Training and testing data: Motif Discovery, Motif Occupancy.
DeepBind's prediction on motif discovery task: files
For questions or to request additional data, please contact
Haoyang Zeng (firstname.lastname@example.org) or David Gifford (email@example.com).
Last updated Nov. 2, 2016.