Natural Language Processing (NLP) project, academic year 2022-2023, at Politecnico di Milano (POLIMI). AUTEXTIFICATION: Automatic Text Identification, classifying texts as model-generated or human-written.

laaners/Literata

 
 

Literata

NLP project at Politecnico di Milano about AUTEXTIFICATION: Automatic Text Identification

Literata Team: Zheng Maria Yu, Alessio Hu, Jakub Jastrzębski, Joanna Rancew

The AUTEXTIFICATION: Automatic Text Identification project covers two tasks: binary classification, to distinguish generated text from human-written text, and multi-class classification, to predict which language model generated a given text.

The notebook is structured into two main subtasks:

  • Subtask 1: Human or Generated,
  • Subtask 2: Which Generation model.

Each subtask begins with data preprocessing and visualizations. Subsequently, a comprehensive collection of models trained on the dataset is presented, ranging from simple machine learning classifiers to various neural networks and transformers.
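To give a feel for the "simple machine learning classifiers" end of that range, here is a minimal sketch of a bag-of-words Naive Bayes text classifier written with the Python standard library only. It is not the notebook's code: the class name `NaiveBayesTextClassifier`, the `tokenize` helper, the label strings, and the toy sentences are all assumptions made up for illustration, and none of the data comes from the AUTEXTIFICATION dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenization; a real pipeline would do more."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one smoothing over word counts.

    Hypothetical illustration only, not the model used in the notebook.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # word counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # log prior + sum of smoothed log likelihoods
            score = math.log(doc_count / total_docs)
            for token in tokenize(text):
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up examples; not taken from the AUTEXTIFICATION dataset.
train_texts = [
    "as an ai language model i cannot answer that",
    "in conclusion the aforementioned factors are significant",
    "lol i missed the bus again this morning",
    "my cat knocked the coffee over ugh",
]
train_labels = ["generated", "generated", "human", "human"]

clf = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = clf.predict("ugh i missed the bus")
print(prediction)
```

Because the labels are arbitrary strings, the same sketch would also handle the multi-class setting of Subtask 2 if each training text were labeled with the (hypothetical) name of the model that produced it.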

To navigate the notebook, use the table of contents on the left (in Colab) for easy access to the different sections.

If you want to run the code yourself, remember to use a GPU for the transformer models!
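In Colab, a GPU runtime can be selected under Runtime > Change runtime type. A quick way to confirm that a GPU is visible before training might look like the sketch below. It assumes the transformer models run on PyTorch, which this README does not state; the helper name `pick_device` is hypothetical, and it falls back to the CPU when PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" when PyTorch sees a GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: the transformer models use PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

If this prints `cpu` in Colab, the runtime type is probably still set to CPU and transformer training will be very slow.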
