
Deep Fake Audio Classifier

This is the submission for Projects 2 and 3 of the graduate-level Deep Learning course (ECE-GY 7123) at NYU Tandon.

This project is currently a work in progress (WIP).

Developers: Abhishek Rathod, Jake Gus, Utkarsh Shekhar
Course: ECE-GY 7123 Spring 2022

Overview

The objective of this project is to create a classifier that differentiates between real audio and fake audio generated by six different architectures.

Model Architecture and Implementation

The model currently uses four convolution blocks (each containing a convolution, ReLU, and BatchNorm layer), followed by average pooling and a linear classifier.
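The described stack can be sketched in PyTorch as follows. This is a minimal illustration only: the channel widths, kernel sizes, spectrogram input shape, and two-class output are assumptions, not details taken from the repository's code.

```python
import torch
import torch.nn as nn

class AudioClassifier(nn.Module):
    """Sketch of the described model: four Conv -> ReLU -> BatchNorm
    blocks, average pooling, and a linear classifier.
    Hyperparameters here are illustrative assumptions."""

    def __init__(self, n_classes=2):
        super().__init__()
        channels = [1, 16, 32, 64, 128]  # assumed channel progression
        blocks = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            blocks += [
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.BatchNorm2d(c_out),
            ]
        self.features = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveAvgPool2d(1)          # average pooling
        self.classifier = nn.Linear(channels[-1], n_classes)

    def forward(self, x):
        x = self.features(x)           # (B, 128, H, W)
        x = self.pool(x).flatten(1)    # (B, 128)
        return self.classifier(x)      # (B, n_classes)

model = AudioClassifier()
# A batch of 4 single-channel 64x64 "spectrograms" (shape is assumed).
logits = model(torch.randn(4, 1, 64, 64))
print(logits.shape)  # torch.Size([4, 2])
```

Adaptive average pooling collapses each feature map to a single value, so the linear head is independent of the input's time/frequency resolution.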

Dataset

The datasets of audio clips are available at:

  1. [LJSpeech](https://keithito.com/LJ-Speech-Dataset/)
  2. [WaveFake](https://zenodo.org/record/5642694#.YmYABNPMKre) [1]

Recreating the results

  1. After downloading the datasets, place them in the directory specified in `data_script.py`. This script sanitizes and pre-processes the audio data.
  2. Run the classifier using `audio_detect.py`.

References

[1] Frank, J., & Schönherr, L. (2021). WaveFake: A Data Set to Facilitate Audio Deepfake Detection. arXiv preprint arXiv:2111.02813.
