This repository contains demo applications and projects I worked on while getting to know data science and data engineering concepts.
-
The goal of this project is to provide a solution for a Business Intelligence team that needs to perform a market study on publicly available data about its competitors. The project features extraction, transformation and loading of the Amazon Customer Reviews dataset and the Netflix Prize dataset with PySpark into a data warehouse (Amazon Redshift). Task execution is demonstrated with AWS Step Functions defined in AWS CloudFormation.
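As a rough illustration of how an extract, transform and load chain can be expressed for Step Functions, the sketch below builds an Amazon States Language definition as a Python dictionary. The state names, the three-step layout and the ARNs are placeholders chosen for this example; the project's actual state machine and the services behind each task may differ.

```python
import json


def build_etl_state_machine(extract_arn, transform_arn, load_arn):
    """Return an Amazon States Language definition that runs three tasks in sequence."""
    return {
        "Comment": "Hypothetical ETL chain: extract, then transform, then load",
        "StartAt": "Extract",
        "States": {
            "Extract": {"Type": "Task", "Resource": extract_arn, "Next": "Transform"},
            "Transform": {"Type": "Task", "Resource": transform_arn, "Next": "Load"},
            # The last state ends the execution instead of naming a successor.
            "Load": {"Type": "Task", "Resource": load_arn, "End": True},
        },
    }


# Placeholder ARNs, not taken from the project.
definition = build_etl_state_machine(
    "arn:aws:lambda:us-east-1:123456789012:function:extract",
    "arn:aws:lambda:us-east-1:123456789012:function:transform",
    "arn:aws:lambda:us-east-1:123456789012:function:load",
)
print(json.dumps(definition, indent=2))
```

The resulting JSON string is the kind of value a CloudFormation template can pass as the state machine definition.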
* * *
-
Prediction of Students Performance on Final Exam
A machine learning pipeline that performs column transformation, feature selection and model training in order to classify students as either likely to fail the final exam or not. Several models were compared by their performance on the test set. The project features data transformation (multiple table joins), exploratory data analysis, explanatory analysis of the XGBoost model predictions and an error analysis of the false negatives.
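The false-negative analysis can be pictured with a small, library-free sketch. It assumes label 1 means "likely to fail" and uses made-up labels; the project itself works on the real test set and model predictions.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = likely to fail)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def false_negative_indices(y_true, y_pred):
    """Return positions of students who failed but were predicted to pass."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t == 1 and p == 0]


# Illustrative labels only, not project data.
y_true = [1, 0, 1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1, 0, 0]

counts = confusion_counts(y_true, y_pred)
missed = false_negative_indices(y_true, y_pred)
recall = counts["tp"] / (counts["tp"] + counts["fn"])
print(counts, missed, recall)
```

The indices in `missed` point back to the rows worth inspecting by hand, which is the starting point of an error analysis.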
* * *
-
A K-means clustering model used to discover customer profiles that inform the strategy of a marketing campaign. Features PCA to visualize the detected clusters and to find the most representative features in the dataset.
A predictive ML model that classifies customers as likely or unlikely to respond to an upcoming marketing campaign. The model is an XGBoost classifier tuned by a random search over hyperparameters. Features exploratory data analysis and explanatory analysis of the classification model predictions.
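Random search over hyperparameters can be sketched without any ML library. In the example below the search space, the parameter names and `toy_score` are assumptions made for illustration; in practice the scoring function would train the classifier and return a cross-validated metric.

```python
import random


def random_search(score_fn, search_space, n_iter=20, seed=0):
    """Sample n_iter random parameter combinations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Draw one candidate value per hyperparameter, independently.
        params = {name: rng.choice(values) for name, values in search_space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space for a gradient-boosted tree classifier.
search_space = {
    "max_depth": [3, 4, 5, 6],
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "n_estimators": [100, 200, 400],
}


def toy_score(params):
    """Stand-in for a cross-validated score; higher is better."""
    return -abs(params["max_depth"] - 4) - abs(params["learning_rate"] - 0.1)


best_params, best_score = random_search(toy_score, search_space, n_iter=30, seed=0)
print(best_params, best_score)
```

Unlike a grid search, the number of evaluations is fixed by `n_iter` rather than by the size of the search space.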
* * *
-
Pneumonia Detection from Chest X-Ray Images
A fine-tuned Convolutional Neural Network (CNN) that detects pneumonia in chest X-ray images. The project visualizes the representations learned by the network using activation maps. It also features a visual explanation of the classifier results with Grad-CAM (Gradient-weighted Class Activation Mapping). Refer to the Overview section for a model performance summary.
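For readers unfamiliar with Grad-CAM, the core computation is small: each feature map of a convolutional layer is weighted by the spatial average of the class-score gradient with respect to that map, the weighted maps are summed, and a ReLU keeps only positive evidence. The sketch below uses plain nested lists and tiny made-up inputs; obtaining real activations and gradients from the network is framework-specific and not shown, and the final scaling to [0, 1] is a common visualization choice rather than part of the definition.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap.

    activations, gradients: lists of K feature maps, each an H x W nested list.
    Returns an H x W heatmap scaled to [0, 1] for display.
    """
    height = len(activations[0])
    width = len(activations[0][0])
    # Channel weights: global average of the gradients over each feature map.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width) for grad_map in gradients
    ]
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, act_map in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * act_map[i][j]
    # ReLU: keep only regions that push the class score up.
    heatmap = [[max(value, 0.0) for value in row] for row in heatmap]
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap


# Two 2x2 feature maps with illustrative values.
activations = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
heatmap = grad_cam(activations, gradients)
print(heatmap)
```

In this toy input the second channel has a negative weight, so the location it highlights is suppressed by the ReLU and only the top-left cell remains active.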
* * *
-
Full-Stack Object Detection Application Hosted in the Cloud
A serverless application that uses the YOLOv2 deep learning model to detect objects in uploaded images.
The app is a single-page website that accepts an input image for serverless back-end processing with AWS Lambda. It responds with the input image marked with bounding boxes that represent the detected objects. Each bounding box carries a label that classifies the detected object.
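The request and response flow can be sketched as a Lambda handler. This is a minimal sketch that assumes an API Gateway proxy-style event carrying a base64-encoded image in `body`; `mark_detections` is a hypothetical stand-in for the YOLOv2 step, and the response payload layout is an assumption rather than the app's actual contract.

```python
import base64
import json


def mark_detections(image_bytes):
    """Hypothetical placeholder for the model step.

    A real implementation would run the detector and draw labeled boxes;
    here the image is returned unchanged with one dummy detection.
    """
    return image_bytes, [{"label": "example", "box": [0, 0, 10, 10], "score": 0.9}]


def handler(event, context):
    """Decode the uploaded image, annotate it, and return it base64-encoded."""
    body = event.get("body") or ""
    if not event.get("isBase64Encoded") or not body:
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "expected a base64-encoded image"}),
        }
    image_bytes = base64.b64decode(body)
    annotated, detections = mark_detections(image_bytes)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(
            {
                "image": base64.b64encode(annotated).decode("ascii"),
                "detections": detections,
            }
        ),
    }


# Local smoke test with fake image bytes.
event = {"body": base64.b64encode(b"fake-image").decode("ascii"), "isBase64Encoded": True}
response = handler(event, None)
print(response["statusCode"])
```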
* * *
-
Recognition of Hand Gestures Captured on Images
The project identifies the number shown by a hand gesture in an image. It uses a ResNet deep learning model and leverages pre-trained weights. The model accepts an input image and returns a probability vector with 6 elements, corresponding to the numbers 0-5.
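Turning the 6-element probability vector into a predicted number is an argmax. The sketch below assumes the vector comes from a softmax over raw scores (the raw scores shown are made up); the function names are illustrative.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_number(probabilities):
    """Return the number (0-5) with the highest probability."""
    if len(probabilities) != 6:
        raise ValueError("expected one probability per number 0-5")
    return max(range(6), key=lambda i: probabilities[i])


# Illustrative raw scores, not real model output.
logits = [0.1, 0.3, 2.5, 0.2, -1.0, 0.0]
probs = softmax(logits)
print(predict_number(probs))  # 2
```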
* * *
-
Face Verification and Recognition on Images
This project implements an Inception-based model for face verification and face recognition.
The face verification part accepts a user's image together with a claimed identity and assesses whether the claim is true. The model compares the input image with the ground truth records stored in a database. Test 1 was conducted with a true claimed identity, while Test 2 introduced a false one.
The face recognition part accepts a user's face image and returns its best prediction of their identity, based on the ground truth records stored in a database.
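The difference between the two tasks can be sketched on top of face embeddings. The example assumes each face is already encoded as a vector and that closeness is measured by Euclidean distance against a threshold; the 3-element embeddings, the names and the 0.7 threshold are made up for illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(embedding, claimed_name, database, threshold=0.7):
    """Verification (1:1): is the input close enough to the claimed identity's record?"""
    return l2_distance(embedding, database[claimed_name]) < threshold


def recognize(embedding, database, threshold=0.7):
    """Recognition (1:N): find the closest record, or None if nothing is close enough."""
    name, distance = min(
        ((n, l2_distance(embedding, ref)) for n, ref in database.items()),
        key=lambda pair: pair[1],
    )
    return (name if distance < threshold else None), distance


# Hypothetical ground truth records and query embedding.
database = {"alice": [0.1, 0.9, 0.0], "bob": [0.8, 0.1, 0.3]}
query = [0.12, 0.88, 0.05]

print(verify(query, "alice", database))  # True claim
print(verify(query, "bob", database))    # False claim
print(recognize(query, database))
```

The two `verify` calls mirror the two tests described above: one with a true claimed identity and one with a false claim.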
* * *
-
Speech Recognition on Synthesized Audio Files
The model detects the trigger word "activate" in input audio files. It triggers a chiming sound when the predicted probability exceeds a certain threshold. The model is a unidirectional Recurrent Neural Network with Gated Recurrent Units (GRU). It was trained on audio files synthesized from background noise, trigger word and non-trigger word sounds.
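The thresholding step can be sketched as a pass over per-time-step probabilities. The `min_gap` parameter, which prevents several chimes for one utterance, is an assumption added for this example, as are the threshold value and the probabilities.

```python
def chime_positions(probabilities, threshold=0.5, min_gap=3):
    """Return the time steps at which a chime would be inserted.

    A chime fires when the probability exceeds the threshold and at least
    min_gap steps have passed since the previous chime.
    """
    positions = []
    last = None
    for step, probability in enumerate(probabilities):
        if probability > threshold and (last is None or step - last >= min_gap):
            positions.append(step)
            last = step
    return positions


# Illustrative model outputs for nine time steps.
probs = [0.1, 0.2, 0.9, 0.95, 0.8, 0.1, 0.05, 0.7, 0.2]
chimes = chime_positions(probs, threshold=0.5, min_gap=3)
print(chimes)  # [2, 7]
```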
* * *
-
Serverless Note Taking Application
This project was inspired by the open-source project Serverless Stack by Anomaly Innovations, a step-by-step guide to building a full-stack serverless application hosted on AWS.