This project predicts the sentiment, gender, emotion, and personality of each character from their dialogues across Marvel Universe movies, then analyzes how the predictions trend across movies and characters.
We built the dataset ourselves by scraping the Marvel movie scripts and mapping each dialogue to the character who speaks it. The dataset covers 15 movies.
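How dialogues are mapped to characters depends on the script layout, which is not described here. As a minimal sketch, assuming a hypothetical layout in which each spoken line looks like `CHARACTER: dialogue`, the mapping step could look like the following. The helper name `parse_dialogues`, the regular expression, and the sample text are illustrative assumptions, not the project's actual code or data.

```python
import re
from collections import defaultdict

# Assumed line layout: an upper-case speaker name, a colon, then the dialogue.
# Real scripts may use a different layout and would need a different pattern.
SPEAKER_LINE = re.compile(r"^\s*([A-Z][A-Z .'\-]*?)\s*:\s*(.+?)\s*$")


def parse_dialogues(script_text):
    """Group dialogue lines by speaker (hypothetical helper)."""
    dialogues = defaultdict(list)
    for line in script_text.splitlines():
        match = SPEAKER_LINE.match(line)
        if match:  # lines without a "SPEAKER:" prefix are skipped
            speaker, text = match.groups()
            dialogues[speaker].append(text)
    return dict(dialogues)


# Made-up sample text for illustration only.
sample = """INT. LAB - NIGHT
TONY STARK: Run the diagnostics again.
Steve enters the room.
STEVE ROGERS: We leave in ten minutes.
TONY STARK: Then I need nine.
"""
print(parse_dialogues(sample))
```

Scene headings and stage directions are dropped here only because they lack the assumed `SPEAKER:` prefix; a screenplay that puts the speaker name on its own line would need a stateful parser instead.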
- Sentiment Analysis: We used the VADER sentiment analysis tool to predict the sentiment of each dialogue. Each dialogue was classified as positive, negative, or neutral based on its VADER compound score.
- Gender Prediction: We fine-tuned a RoBERTa transformer model on the Twitter gender dataset by adding a single classification layer on top of the pre-trained model. We then froze the transformer weights and trained only the linear classification layer on the Marvel dataset.
- Emotion Prediction: We used the GoEmotions dataset, which has 27 unique emotions. We applied a pre-trained BERT model to the Marvel dataset for zero-shot emotion classification.
- Personality Prediction: We fine-tuned a RoBERTa transformer model on the MBTI dataset, which has 16 unique personality types, by adding a single classification layer on top of the pre-trained model.
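The sentiment step in the list above turns a VADER compound score into one of three labels, and the per-character trends come from aggregating those labels. The sketch below shows one way this could be done. The cutoffs are not stated here; ±0.05 are the thresholds conventionally suggested in the VADER documentation and are an assumption, as are the helper names `label_from_compound` and `sentiment_counts`. In practice the compound score would come from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` in the `vaderSentiment` package; the scores below are made up so the sketch runs without it.

```python
from collections import Counter, defaultdict

# Assumed cutoffs: the conventional VADER thresholds, not confirmed by this project.
POSITIVE_CUTOFF = 0.05
NEGATIVE_CUTOFF = -0.05


def label_from_compound(compound):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= POSITIVE_CUTOFF:
        return "positive"
    if compound <= NEGATIVE_CUTOFF:
        return "negative"
    return "neutral"


def sentiment_counts(scored_dialogues):
    """Count labels per (movie, character) from (movie, character, compound) rows."""
    counts = defaultdict(Counter)
    for movie, character, compound in scored_dialogues:
        counts[(movie, character)][label_from_compound(compound)] += 1
    return counts


# Made-up scores for illustration only; not results from the project.
rows = [
    ("Movie A", "Character X", 0.62),
    ("Movie A", "Character X", -0.40),
    ("Movie A", "Character Y", 0.01),
    ("Movie B", "Character X", 0.30),
]
for key, counter in sentiment_counts(rows).items():
    print(key, dict(counter))
```

Keying the counts by `(movie, character)` keeps both axes available, so the same table can be summed per character across movies or per movie across characters.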
For a detailed analysis of the results, please refer to the report.
- Clone the repository
- Run the following command to install the required libraries:
pip install -r requirements.txt