Literata

NLP project at Politecnico di Milano about AUTEXTIFICATION: Automatic Text Identification

Literata Team: Zheng Maria Yu, Alessio Hu, Jakub Jastrzębski, Joanna Rancew

The project covers both tasks of AUTEXTIFICATION (Automatic Text Identification): binary classification, which distinguishes generated text from human-written text, and multi-class classification, which predicts which language model generated a given text.

The notebook is structured into two main subtasks:

Subtask 1: Human or Generated
Subtask 2: Which Generation Model

Each subtask begins with data preprocessing and visualizations. Subsequently, a comprehensive collection of models trained on the dataset is presented, ranging from simple machine learning classifiers to various neural networks and transformers.
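As a rough illustration of the "simple machine learning classifiers" end of that range, here is a minimal sketch of a multinomial Naive Bayes text classifier written with the standard library only. It is not taken from the notebook: the class name `NaiveBayesTextClassifier`, the whitespace tokenizer, the toy sentences, and the label names `human` / `generated` are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately crude assumption)."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        # Number of training documents per class (used for the prior).
        self.class_counts = Counter(labels)
        # Per-class word frequencies.
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary shared by all classes.
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up examples; the label names are hypothetical.
train_texts = [
    "honestly the train was late again and i was furious",
    "my grandmother baked bread every sunday morning",
    "in conclusion the aforementioned factors are important",
    "overall it is important to note that the factors vary",
]
train_labels = ["human", "human", "generated", "generated"]

clf = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(clf.predict("it is important to note the factors"))  # prints "generated"
```

Because the labels are arbitrary strings, the same sketch would also handle a multi-class setup such as Subtask 2 by passing one label per generation model instead of two classes.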

To navigate the notebook, use the table of contents in the left sidebar (Colab) to jump to the different sections.

If you want to run the code yourself, remember to use a GPU for the transformer models!
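Before starting the transformer sections, it may help to confirm that a GPU is actually visible to the runtime. The snippet below is a best-effort sketch, not part of the notebook: the helper name `gpu_available` is hypothetical, and it assumes an NVIDIA GPU (it uses PyTorch's check when PyTorch is installed, otherwise it only looks for the `nvidia-smi` tool on the PATH).

```python
import importlib.util
import shutil


def gpu_available() -> bool:
    """Best-effort check for a CUDA-capable NVIDIA GPU."""
    # Prefer PyTorch's own check when PyTorch is installed.
    if importlib.util.find_spec("torch") is not None:
        import torch

        return bool(torch.cuda.is_available())
    # Otherwise fall back to looking for the NVIDIA driver tool on PATH.
    return shutil.which("nvidia-smi") is not None


print("GPU available:", gpu_available())
```

If this reports `False` on Colab, the runtime type is probably set to CPU and can be changed in the runtime settings.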

Repository files:

Literata.ipynb
README.md
dataset.zip