Project Description

This project predicts the sentiment, gender, emotion, and personality of each character from their dialogues across Marvel Universe movies. We then identify trends across movies and characters and analyze the results.

Data

We built the dataset ourselves by scraping Marvel movie scripts and mapping each dialogue to the character who speaks it. The dataset covers 15 movies.
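As a rough illustration of the dialogue-to-character mapping step, here is a minimal sketch. It assumes a hypothetical script layout in which each spoken line looks like `CHARACTER NAME: dialogue`. The actual script format and the project's parsing code are not described here, so the regular expression, the `map_dialogues` helper, and the sample lines are all illustrative assumptions.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) line format: "TONY STARK: I am Iron Man."
# Real scripts may use a different layout and need a different pattern.
LINE_RE = re.compile(r"^\s*([A-Z][A-Z .'\-]+?)\s*:\s*(.+)$")


def map_dialogues(script_text):
    """Group dialogue lines by speaking character (illustrative helper)."""
    dialogues = defaultdict(list)
    for line in script_text.splitlines():
        match = LINE_RE.match(line)
        if match:  # skip stage directions and other non-dialogue lines
            character, dialogue = match.groups()
            dialogues[character.strip()].append(dialogue.strip())
    return dict(dialogues)


# Made-up sample input in the assumed format.
sample = """TONY STARK: I am Iron Man.
[Explosion in the background]
STEVE ROGERS: I can do this all day.
TONY STARK: Sometimes you gotta run before you can walk.
"""

result = map_dialogues(sample)
for character, lines in result.items():
    print(character, len(lines))
```

Lines that do not match the assumed pattern, such as stage directions, are simply skipped in this sketch.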

Methodology

  1. Sentiment Analysis: We used the VADER sentiment analysis tool to predict the sentiment of each dialogue. Each dialogue was classified as positive, negative, or neutral based on VADER's compound score.

  2. Gender Prediction: We fine-tuned a RoBERTa transformer model, with a single classification layer added on top of the pre-trained model, on the Twitter gender dataset. We then froze the transformer weights and trained only the linear classification layer on the Marvel dataset.

  3. Emotion Prediction: We used the GoEmotions dataset, which has 27 unique emotions, and applied a pre-trained BERT model for zero-shot emotion classification on the Marvel dataset.

  4. Personality Prediction: We fine-tuned a RoBERTa transformer model, with a single classification layer added on top of the pre-trained model, on the MBTI dataset, which has 16 unique personality types.
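The sentiment step in item 1 can be sketched as follows. The thresholds this project applied to the compound score are not stated, so the ±0.05 cutoff below is an assumption (it is the convention commonly suggested in the VADER documentation). The helper names `label_from_compound` and `vader_compound` are hypothetical.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label.

    The 0.05 threshold is an assumed convention, not a value confirmed
    by this project.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def vader_compound(text):
    """Return VADER's compound score for one dialogue.

    Requires the `vaderSentiment` package; it is imported lazily so the
    thresholding helper above can be used without it.
    """
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return analyzer.polarity_scores(text)["compound"]


# Thresholding on hand-picked example scores (no VADER call needed).
labels = [label_from_compound(score) for score in (0.62, -0.48, 0.01)]
print(labels)
```

With `vaderSentiment` installed, a dialogue could be labeled with `label_from_compound(vader_compound(dialogue))`.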

Results

For a detailed analysis of the results, please refer to the report.

How to run the code

  1. Clone the repository
  2. Run the following command to install the required libraries:
     pip install -r requirements.txt
  3. Since there are 4 different models, the code is divided into 4 different notebooks. Run the following notebooks in the given order: