A curated list of my data science projects. Check out my LinkedIn profile, certificates, and website for more information.
Note: Data used in the projects is for demonstration purposes only.
- PyMolSAR: A Generalizable Tool for Small-Molecule Property Prediction: A Python package that calculates molecular descriptors and compares several supervised learning algorithms to build the most appropriate Quantitative Structure-Activity Relationship (QSAR) model for predicting the chemical properties of small molecules.
- E-Commerce Product Classifier: An ensemble model for classifying e-commerce products into product categories, using a bag-of-words model for text data and a pre-trained VGG-16 model for image data.
- RMS Titanic Passenger Survival: Exploratory analysis and a classification model to predict the survival of passengers aboard the RMS Titanic.
- Predicting Housing Prices in Ames, Iowa: A model that predicts the value of a given house in Ames, Iowa using various statistical analysis tools, and uses machine learning to identify the best price at which a client can sell their house.
- Classifying Iris Flowers: Various classification models, with evaluation metrics, to classify Iris flowers into 3 classes.
- Classifying Mushrooms: Exploratory analysis and classification models to classify mushrooms as edible or poisonous.
- Movie Recommendations: A content-based recommender that suggests movies based on how similar they are to one another.
Tools: scikit-learn, Keras, Pandas
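
As a rough illustration of the content-based idea behind the Movie Recommendations entry (not the project's actual code), the sketch below ranks items by the cosine similarity of their bag-of-words vectors. The catalogue, the function names, and the choice of plain word counts are all assumptions made for this example, and it uses only the standard library so that it stays self-contained.

```python
import math
from collections import Counter

# Toy catalogue invented for illustration; titles and descriptions are hypothetical.
MOVIES = {
    "Space Quest": "space adventure with aliens and starships",
    "Galaxy Wars": "epic space battle with aliens and starships",
    "Love in Paris": "romantic comedy set in paris",
}


def bag_of_words(text):
    """Count how often each lowercase, whitespace-separated word occurs."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, catalogue, top_n=1):
    """Return the top_n other titles most similar to the given title."""
    query = bag_of_words(catalogue[title])
    scores = [
        (cosine_similarity(query, bag_of_words(description)), other)
        for other, description in catalogue.items()
        if other != title
    ]
    scores.sort(reverse=True)
    return [other for _, other in scores[:top_n]]
```

Calling `recommend("Space Quest", MOVIES)` ranks "Galaxy Wars" first because the two descriptions share most of their words, while "Love in Paris" shares none.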
- Disasters on Social Media: Exploratory analysis and classification models built on bag-of-words, TF-IDF, Word2Vec, and CNN approaches to detect which tweets are about a disaster rather than an irrelevant topic, with accuracies of ~80%.
Tools: NLTK, scikit-learn, Keras, Pandas
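
To show what the TF-IDF features mentioned above look like, here is a minimal sketch that weights each word in a tweet by its term frequency times the log inverse document frequency. The tweets are invented, `tf_idf` is a hypothetical helper, and the unsmoothed formula used here is a textbook variant that differs from the smoothed default in scikit-learn; it is not the project's code.

```python
import math
from collections import Counter

# Toy tweets invented for illustration only.
TWEETS = [
    "fire in the city center",
    "the new album is fire",
    "flood warning in the valley",
]


def tf_idf(documents):
    """Return one {word: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


WEIGHTS = tf_idf(TWEETS)
```

A word that appears in every tweet, such as "the", gets a weight of zero, while a word that appears in only one tweet, such as "flood", gets the largest weight. These vectors could then be fed to a classifier.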
- Forecasting of Vehicle Traffic at Signals: An additive model to forecast non-linear hourly traffic with yearly and weekly seasonality.
- Clean Energy Demand in the United States: An auto-regressive (AR) forecasting model and a Tableau dashboard to visualize the past and future clean energy demand of the United States.
Tools: Pandas, Prophet, Tableau
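
For readers unfamiliar with auto-regressive forecasting, the sketch below fits the simplest case, an AR(1) model `x_t = c + phi * x_{t-1}`, by ordinary least squares and rolls it forward. The function names and the model order are assumptions made for illustration; the project itself may use a different order or library.

```python
def fit_ar1(series):
    """Fit x_t = c + phi * x_{t-1} by ordinary least squares."""
    previous, current = series[:-1], series[1:]
    mean_prev = sum(previous) / len(previous)
    mean_curr = sum(current) / len(current)
    covariance = sum(
        (p - mean_prev) * (q - mean_curr) for p, q in zip(previous, current)
    )
    variance = sum((p - mean_prev) ** 2 for p in previous)
    phi = covariance / variance
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast(series, steps):
    """Forecast the next `steps` values by iterating the fitted AR(1) model."""
    c, phi = fit_ar1(series)
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions
```

Each forecast is fed back in as the input for the next step, so errors compound as the horizon grows.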
- Network Analysis of Nuclear Energy Research: A Web of Science web-scraping script to retrieve metadata of published papers, and a Tableau dashboard to track collaborations in nuclear energy research from 2000 to 2016.
- Indeed Job Postings Analysis: A web-scraping script for Indeed that retrieves job details from the latest postings.
- Academic Faculty Salary Analysis: A Google Scholar web-scraping script to retrieve researchers' citation metrics, and a correlation analysis of total citation count against faculty salary at the University of Washington.
Tools: Pandas, Tableau, Selenium, Beautiful Soup, Seaborn, Matplotlib
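
The correlation analysis mentioned above can be illustrated with a Pearson coefficient computed by hand. This is only a sketch: `pearson` is a hypothetical helper and the paired inputs are whatever two numeric columns are being compared, so nothing here reflects the project's data or findings.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)
```

A value near 1 means the two quantities rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is little linear relationship.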
Contact me at rahul.avadhoot@gmail.com to talk about my portfolio, work opportunities, or collaborations.