Master studies
For the Serbian version of this document, click here.
- Aleksandar Anžel 1025/2018
- This repository contains the source and data files for determining protein N-glycosylation with SVMs and fully connected neural networks. For each protein in the data set, physical and chemical attributes are computed and used as input to the learning process. Several techniques were used to address the imbalance present in the data set.
- Every source file is adapted to run inside the Google Colaboratory environment.
- Please use the files from the src/Refactored/ directory.
- Information on the data used can be found in Data_generator.ipynb and in Master_rad.pdf.
- Instructions for creating the input data for the machine learning models can be found in Master_rad.pdf.
- Most frequently used Python libraries: Biopython, scikit-learn, Keras, NumPy, pandas, Matplotlib.
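
The description above mentions techniques for handling unbalanced data but does not name them here. As an illustration only, one common option is to weight each class inversely to its frequency, which is the heuristic scikit-learn applies for `class_weight="balanced"`. The sketch below is a minimal, standard-library version of that formula; the function name `balanced_class_weights` and the toy labels are hypothetical and are not taken from the repository.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: weight} with weight = n_samples / (n_classes * count).

    Rare classes receive larger weights, so their errors count more
    during training. Hypothetical helper, not part of the repository.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Toy labels (assumption): 1 = glycosylated site, 0 = non-glycosylated site.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)

for label in sorted(weights):
    print(f"class {label}: weight {weights[label]:.3f}")
```

A dictionary of this shape can be passed as the `class_weight` argument of `sklearn.svm.SVC` or of Keras `Model.fit`. Whether the models in this repository use class weighting, resampling, or another approach is described in Master_rad.pdf.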