
Repo containing the practical part of my Master's Thesis (in Serbian)


AAnzel/Master_thesis


Practical part of my Master's Thesis

Subject: Determining protein N-glycosylation with machine learning methods

Master studies

For the Serbian version of this document, click here.

Author:

  • Aleksandar Anžel 1025/2018

Problem:

  • This repository contains the source code and data files for determining protein N-glycosylation with support vector machines (SVMs) and fully connected neural networks. For each protein in the dataset, physical and chemical attributes are computed and used as input features for training. Several techniques were applied to address the class imbalance present in the dataset.
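
To make the description above more concrete, here is a minimal sketch of the general idea, not the code from this repository. It assumes candidate sites are asparagines in the well-known N-X-S/T sequon (X is not proline), builds two illustrative attributes for a window around each site, and computes class weights as one common way to handle imbalanced labels. The function names, the window size, and the choice of attributes are all hypothetical; the actual attributes and balancing techniques are described in Master_rad.pdf.

```python
import re
from collections import Counter

# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# N-X-S/T sequon with X != P. The lookahead keeps overlapping motifs.
SEQUON = re.compile(r"N(?=[^P][ST])")


def candidate_sites(seq):
    """Return 0-based positions of asparagines inside an N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(seq)]


def window_features(seq, pos, half_width=3):
    """Illustrative (hypothetical) attributes for the window around seq[pos]."""
    window = seq[max(0, pos - half_width): pos + half_width + 1]
    # Mean hydropathy of the residues in the window.
    hydropathy = sum(KD[a] for a in window) / len(window)
    # Fraction of uncharged polar residues (S, T, N, Q) in the window.
    polar_fraction = sum(a in "STNQ" for a in window) / len(window)
    return [hydropathy, polar_fraction]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


if __name__ == "__main__":
    seq = "MNGSANPTNAT"  # made-up sequence, for illustration only
    for pos in candidate_sites(seq):
        print(pos, window_features(seq, pos))
    print(balanced_class_weights([0] * 8 + [1] * 2))
```

The weighting formula is the same one scikit-learn applies for `class_weight="balanced"`, so the resulting dictionary could be passed to an SVM as `class_weight`. Resampling the minority class is another common option.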

Running the Jupyter notebooks:

  • Every source file is set up to run in the Google Colaboratory environment.
  • Please use the files from the src/Refactored/ directory.

Data:

  • Information about the data used can be found in Data_generator.ipynb and in Master_rad.pdf.
  • Instructions for creating the input data for the machine learning models can be found in Master_rad.pdf.
  • Most frequently used Python libraries: Biopython, scikit-learn, Keras, NumPy, pandas, Matplotlib.
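
When running the notebooks outside Google Colaboratory, it may help to check first which of the libraries listed above are installed. The sketch below uses only the standard library. The mapping from library names to import names is an assumption based on the usual package layouts, and `missing_libraries` is a hypothetical helper, not part of this repository.

```python
from importlib.util import find_spec

# Assumed import names for the libraries listed in this README.
IMPORT_NAMES = {
    "Biopython": "Bio",
    "scikit-learn": "sklearn",
    "Keras": "keras",
    "NumPy": "numpy",
    "pandas": "pandas",
    "Matplotlib": "matplotlib",
}


def missing_libraries(import_names=IMPORT_NAMES):
    """Return the sorted names of libraries whose module cannot be found."""
    return sorted(
        name for name, module in import_names.items()
        if find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed libraries are available.")
```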