Skip to content

A multilingual DeBERTa model fine-tuned on political communication to classify discrete emotions

License

Notifications You must be signed in to change notification settings

tweedmann/pol_emo_mDeBERTa2

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pol_emo_mDeBERTa2: Discrete-Emotional-Classifier

DOWNLOAD

Start by downloading v1.0.0 on the right side under "Releases". Please download the zip-file which contains all necessary files to apply the fine-tuned transformer model.

Model description

The model is based on the data described in the following paper:

Widmann, Tobias, and Maximilian Wich. 2022. "Creating and Comparing Dictionary, Word Embedding, and Transformer-Based Models to Measure Discrete Emotions in German Political Text." Political Analysis, June, 1--16. https://doi.org/10.1017/pan.2022.15.

However, different to the model described in the paper, pol_emo_mDeBERTa2 has been trained (80%) and tested (20%) on the whole set of crowd-coded data: 19620 sentences from political communication. Furthermore, the base model used is a multilingual mDeBERTa trained on data from 100 languages (mDeBERTa-v3-base from Microsoft). Thus, in comparison to the model presented in the paper, the fine-tuned model pol_emo_mDeBERTa2, can be applied to more languages than German.

You can learn more about the efficiency of DeBERTa using ELECTRA-Style pre-training with Gradient Disentangled Embedding Sharing in their research paper. You can learn more about the CC100 dataset that has been used to train the base model here.

The model classifies text into to the following eight emotions: anger, fear, disgust, sadness, joy, enthusiasm, pride, and hope.

Python code is provided below.

Eval results

The model was evaluated using 20% of the test dataset described in the research paper. The metrics described below are macro precision, recall, and F1 scores.

Emotion Precision Recall F1 Score
anger 0.782 0.775 0.778
fear 0.725 0.664 0.689
disgust 0.697 0.768 0.727
sadness 0.776 0.72 0.744
joy 0.819 0.84 0.829
enthusiasm 0.691 0.693 0.692
pride 0.79 0.666 0.71
hope 0.647 0.603 0.62
Macro Avg 0.741 0.716 0.724

In addition, to test its applicability to other languages than German, I applied the model to more than 1,200 test sentences in English, Spanish, and French. The sentences have been created by an generative AI model, resembling stereotypical statements from political communication (with and without discrete emotional appeals). The performance metrics for this exercise can be seen below. As indicated, the model achieves high performance in languages other than German:

Emotion Precision Recall F1 Score
anger 0.837 0.842 0.84
fear 0.866 0.803 0.831
disgust 0.92 0.604 0.657
sadness 0.876 0.88 0.878
joy 0.863 0.963 0.906
enthusiasm 0.782 0.622 0.663
pride 0.95 0.852 0.894
hope 0.91 0.707 0.768
Macro Avg 0.875 0.784 0.805

Citation

Please always cite the following paper when using the model:

Widmann, Tobias, and Maximilian Wich. 2022. "Creating and Comparing Dictionary, Word Embedding, and Transformer-Based Models to Measure Discrete Emotions in German Political Text." Political Analysis, June, 1--16. https://doi.org/10.1017/pan.2022.15.

Ideas for cooperation or questions?

If you have questions or ideas for cooperation, contact me at widmann{at}ps{dot}au{dot}dk

Model Use

# torch libraries
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# pytorch libraries
import pytorch_lightning as pl
from torchmetrics import F1Score
from torchmetrics.functional import accuracy, auroc #F1Score #f1
from pytorch_lightning.callbacks import ModelCheckpoint, EarlyStopping
from pytorch_lightning.loggers import TensorBoardLogger

# transformers libraries
from transformers import AutoTokenizer, DebertaV2Model, AdamW, get_linear_schedule_with_warmup

import tqdm

import pandas as pd
#set working directory to the downloaded folder
import os
os.chdir('/path/to/your/directory')
#define functions necessary to apply mDeBERTa model
LABEL_COLUMNS = ['anger_v2', 'fear_v2', 'disgust_v2', 'sadness_v2', 'joy_v2', 'enthusiasm_v2', 'pride_v2', 'hope_v2']
BASE_MODEL_NAME = "microsoft/mdeberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_NAME)
batch_size = 8
device = "cuda" if torch.cuda.is_available() else "cpu"


class CrowdCodedTagger(pl.LightningModule):

  def __init__(self, n_classes: int, n_training_steps=None, n_warmup_steps=None):
    super().__init__()
    self.bert = DebertaV2Model.from_pretrained(BASE_MODEL_NAME, return_dict=True)
    self.classifier = nn.Linear(self.bert.config.hidden_size, n_classes)
    self.n_training_steps = n_training_steps
    self.n_warmup_steps = n_warmup_steps
    self.criterion = nn.BCELoss()

  def forward(self, input_ids, attention_mask, labels=None, token_type_ids=None):
    output = self.bert(input_ids, attention_mask=attention_mask)
    output = self.classifier(output.last_hidden_state[:, 0])
    output = torch.sigmoid(output)
    loss = 0
    if labels is not None:
        loss = self.criterion(output, labels)
    return loss, output

  def training_step(self, batch, batch_idx):
    input_ids = batch["input_ids"]
    attention_mask = batch["attention_mask"]
    labels = batch["labels"]
    loss, outputs = self(input_ids, attention_mask, labels)
    self.log("train_loss", loss, prog_bar=True, logger=True)
    return {"loss": loss, "predictions": outputs, "labels": labels}

  def validation_step(self, batch, batch_idx):
    input_ids = batch["input_ids"]
    attention_mask = batch["attention_mask"]
    labels = batch["labels"]
    loss, outputs = self(input_ids, attention_mask, labels)
    self.log("val_loss", loss, prog_bar=True, logger=True)
    return loss

  def test_step(self, batch, batch_idx):
    input_ids = batch["input_ids"]
    attention_mask = batch["attention_mask"]
    labels = batch["labels"]
    loss, outputs = self(input_ids, attention_mask, labels)
    self.log("test_loss", loss, prog_bar=True, logger=True)
    return loss

  def training_epoch_end(self, outputs):

    labels = []
    predictions = []
    for output in outputs:
      for out_labels in output["labels"].detach().cpu():
        labels.append(out_labels)
      for out_predictions in output["predictions"].detach().cpu():
        predictions.append(out_predictions)

    labels = torch.stack(labels).int()
    predictions = torch.stack(predictions)

    for i, name in enumerate(LABEL_COLUMNS):
      class_roc_auc = auroc(predictions[:, i], labels[:, i])
      self.logger.experiment.add_scalar(f"{name}_roc_auc/Train", class_roc_auc, self.current_epoch)

  def configure_optimizers(self):

    optimizer = AdamW(self.parameters(), lr=2e-5) #DEFINING LEARNING RATE

    scheduler = get_linear_schedule_with_warmup(
      optimizer,
      num_warmup_steps=self.n_warmup_steps,
      num_training_steps=self.n_training_steps
    )

    return dict(
      optimizer=optimizer,
      lr_scheduler=dict(
        scheduler=scheduler,
        interval='step'
      )
    )

# Define function for inference
def predict_labels(df):
    input_text = df['sentence'].tolist()
    num_inputs = len(input_text)
    num_batches = (num_inputs - 1) // batch_size + 1
    batched_input = [input_text[i * batch_size:(i + 1) * batch_size] for i in range(num_batches)]
    batched_output = []

    for i, batch in enumerate(tqdm.tqdm(batched_input)):
        encoded_input = tokenizer(batch, padding=True, truncation=True, max_length=512, return_tensors='pt')
        outputs = model(**encoded_input.to(device))

        # Extract the decimal numbers from the tensor
        tensor_values = outputs[1].tolist()
        decimal_numbers = [[num for num in sublist] for sublist in tensor_values]

        # Create Pandas DataFrame
        output_df = pd.DataFrame(decimal_numbers, columns=LABEL_COLUMNS)

        # Apply threshold function to DataFrame
        threshold = 0.65
        threshold_fn = lambda x: 1 if x >= threshold else 0
        output_df = output_df.applymap(threshold_fn)

        # Concatenate input DataFrame with output DataFrame
        input_df = df.iloc[i * batch_size:(i + 1) * batch_size].reset_index(drop=True)
        output_df = pd.concat([input_df, output_df], axis=1)

        batched_output.append(output_df)


    # Concatenate all batches into a single output DataFrame
    output_df = pd.concat(batched_output, ignore_index=True)

    return output_df
#provide an input dataframe, for example by loading in a csv file
#by default, the column in the dataframe including sentences to be classified should be called "sentence" (can be adjusted above)
input_df = pd.read_csv("./example_data.csv")
input_df
# putting model into evaluation mode and load local fine-tuned model
model = CrowdCodedTagger(n_classes=8)
model.load_state_dict(torch.load("./model/pytorch_model.pt"), strict = False)
model.to(device)
model.eval() 
#apply function to the input_df
results = predict_labels(input_df)
print(results)
#save results as a csv file in the working directory
result.to_csv('example_results.csv', index=False)