CNNSplice: Robust Models for Splice Site Prediction Using Deep Convolutional Neural Networks

CNNSplice

OluwadareLab, University of Colorado, Colorado Springs

Access Web Server: http://www.cnnsplice.online


Developers:
Algorithm and Model:
              Victor Akpokiro
              Department of Computer Science
              University of Colorado, Colorado Springs
              Email: vakpokir@uccs.edu

Web Server:
               M. A. Mohit Chowdhury (hchowdhu@uccs.edu), Samuel Olowofila (solowofi@uccs.edu) and Raisa Nusrat (rnusrat@uccs.edu)

Contact:
              Oluwatosin Oluwadare, PhD
              Department of Computer Science
              University of Colorado, Colorado Springs
              Email: ooluwada@uccs.edu


1. Build Instruction:

CNNSplice can be run locally in a Docker container on the user's computer. Before cloning this repository and attempting to build, install the Docker engine. If you are new to Docker, work through a quick Docker tutorial for beginners first.
To install and build CNNSplice, follow these steps.

  1. Clone this repository locally using the command git clone https://github.com/OluwadareLab/CNNSplice.git.
  2. Pull the CNNSplice Docker image from Docker Hub using the command docker pull oluwadarelab/cnnsplice:latest. This may take a few minutes. Once finished, check that the image was successfully pulled using docker image ls.
  3. Run the CNNSplice container and mount the present working directory to the container using docker run -v ${PWD}:${PWD} -p 8050:8050 -it oluwadarelab/cnnsplice.
  4. Inside the container, cd to the directory that contains your files.

Exciting! You can now access CNNSplice locally.

2. Dependencies:

Skip this step if you followed the Docker instructions above.
CNNSplice is developed in Python 3, and all dependencies are included in the Docker environment. The dependencies are also listed in the attached requirement.txt file. To install them locally, for example inside a virtual environment, run pip install -r requirement.txt from the directory that contains the file.

  • Our constructed dataset uses a sequence length of 400.

3. Training Usage:

Usage: To train, run python train.py -n "output_name" -m mode in the terminal.
For example: python train.py -n "output_name" -m "balanced"

  • Arguments:

    • output_name: A user specified string for output naming convention
    • mode: A string that selects the balanced or imbalanced input dataset, i.e., "balanced" or "imbalanced"
  • Outputs:
    The outputs of training include:

    • .h5: The trained model file.
    • .txt: Text files containing the evaluation metrics, stored in the log directory.
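
The two arguments described above can be illustrated with a small argparse sketch. This is not the repository's train.py; build_parser and the long option names --name and --mode are hypothetical and only mirror the -n and -m options documented here.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the -n and -m options described above."""
    parser = argparse.ArgumentParser(description="Training options (sketch).")
    # -n: user-specified string used to name the output files.
    parser.add_argument("-n", "--name", required=True,
                        help="string used to name the outputs")
    # -m: restricts the value to the two dataset modes named in this README.
    parser.add_argument("-m", "--mode", required=True,
                        choices=["balanced", "imbalanced"],
                        help="which input dataset to use")
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(["-n", "output_name", "-m", "balanced"])
print(args.name, args.mode)
```

With choices set, argparse rejects any mode other than "balanced" or "imbalanced" before any training code would run.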

4. Testing Usage:

Usage: To test, run python test.py -n "output_name" -m mode in the terminal, where mode is "balanced" or "imbalanced".
For example: python test.py -n "output_name" -m "balanced"

  • Arguments:

    • output_name: A user specified string for output naming convention
    • mode: A string that selects the balanced or imbalanced input dataset, i.e., "balanced" or "imbalanced"
  • Outputs:
    The outputs of testing include:

    • .txt: Text files containing the evaluation metrics, stored in the log directory.
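
Because training and testing take the same two arguments, the two commands can be chained from one small script. The following is a minimal sketch under stated assumptions: build_command and train_and_test are hypothetical helpers, and the log directory is assumed to be named log and to sit next to the scripts. By default the sketch only returns the commands without running them.

```python
import subprocess
import sys
from pathlib import Path

VALID_MODES = ("balanced", "imbalanced")


def build_command(script, name, mode):
    """Build the argument list for train.py or test.py as documented above."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return [sys.executable, script, "-n", name, "-m", mode]


def train_and_test(name, mode, log_dir="log", dry_run=True):
    """Return the train and test commands; run them only if dry_run is False."""
    commands = [
        build_command("train.py", name, mode),
        build_command("test.py", name, mode),
    ]
    if not dry_run:
        # Assumed directory name; the metrics text files are written here.
        Path(log_dir).mkdir(exist_ok=True)
        for command in commands:
            subprocess.run(command, check=True)
    return commands


for command in train_and_test("output_name", "balanced"):
    print(" ".join(command))
```

Calling train_and_test("output_name", "balanced", dry_run=False) from the repository directory would run both scripts in order and stop at the first failure.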

5. Note:

  • The dataset sequence length is 400.
  • Ensure you have a log directory for storing the output text files.
  • Genomic sequence input data should be transformed using one-hot encoding.
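
One-hot encoding of a length-400 sequence can be sketched as follows. This is an illustration, not the repository's preprocessing code: one_hot_encode is a hypothetical helper, and both the channel order A, C, G, T and the all-zero row for ambiguous bases such as N are assumptions.

```python
ALPHABET = "ACGT"  # assumed channel order; the repository may use another
SEQ_LEN = 400      # sequence length stated in this README


def one_hot_encode(sequence, seq_len=SEQ_LEN):
    """Return a seq_len x 4 list of 0/1 rows for a DNA string."""
    sequence = sequence.upper()
    if len(sequence) != seq_len:
        raise ValueError(f"expected {seq_len} bases, got {len(sequence)}")
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in sequence:
        row = [0] * len(ALPHABET)
        # Bases outside A, C, G, T (for example N) stay all-zero: an assumption.
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


encoded = one_hot_encode("ACGT" * 100)
print(len(encoded), len(encoded[0]))
```

The nested list can be converted to an array before it is passed to a model, for example with numpy.array(encoded).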
