This project was developed for the 'Natural Language Processing with Disaster Tweets' Kaggle competition (https://www.kaggle.com/c/nlp-getting-started/overview). The goal of the competition is to build a Machine Learning model that monitors Tweets and predicts which ones are about real disasters and which are not.
The main objectives of this project are the following:
- Perform basic Exploratory Data Analysis (EDA) on the data to gain better insights
- Parse Tweets and apply text processing to remove unnecessary characters
- Use TF-IDF to transform Tweet documents into features
- Train a Multinomial Naive Bayes classifier as the final Machine Learning model
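
As an illustration of the text processing objective, here is a minimal sketch of a Tweet cleaning function using only the Python standard library. The function name `clean_tweet` and the specific cleaning rules (lowercasing, dropping URLs, HTML tags, user mentions, and ASCII punctuation) are assumptions made for this example and may differ from the rules actually used in the project.

```python
import html
import re
import string

# Hypothetical cleaning rules, chosen for illustration only.
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links such as http://t.co/...
HTML_TAG_RE = re.compile(r"<[^>]+>")             # leftover HTML tags
MENTION_RE = re.compile(r"@\w+")                 # user mentions such as @user
PUNCT_TABLE = str.maketrans("", "", string.punctuation)  # ASCII punctuation only


def clean_tweet(text: str) -> str:
    """Return a lowercased Tweet with URLs, tags, mentions, and punctuation removed."""
    text = html.unescape(text)        # turn entities like &amp; into plain characters
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = HTML_TAG_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    # Deleting punctuation also drops '#', so a hashtag keeps its word.
    text = text.translate(PUNCT_TABLE)
    # Collapse the runs of whitespace left behind by the substitutions.
    return " ".join(text.split())


if __name__ == "__main__":
    print(clean_tweet("Forest FIRE near La Ronge!! http://t.co/abc123 @user #wildfire"))
    # prints: forest fire near la ronge wildfire
```

The cleaned strings could then be passed to a TF-IDF vectorizer and a Multinomial Naive Bayes classifier, as listed in the objectives above.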