- Google Machine Learning Bootcamp 2024 Gemma Sprint Project
- Create a project outcome using Gemma2, which can be a model, code, app, or tutorial.
- Participate individually or as a team of up to 3 people with an original project idea.
Live video streaming services are interactive media in which streamers and viewers communicate through live chat. As streamers look for smarter ways to manage chat, intelligent chatbots are needed. This project builds a live-stream chatbot with the Gemma2 2B model as an extension program for AfreecaTV (SOOP). It detects questions in live chat and answers them automatically, and it checks for inappropriate text and removes various types of spam.
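The flow described above (moderate first, then answer questions) can be sketched as a small router. This is a minimal sketch, not the project's actual code: `route_chat`, `ChatAction`, and the three `toy_*` functions are hypothetical names, and the toy functions are trivial stand-ins for the question classifier, the inappropriate-text detector, and the Gemma2 2B answer generator.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ChatAction:
    kind: str        # "remove", "answer", or "ignore"
    reply: str = ""  # filled only when kind == "answer"


def route_chat(
    message: str,
    is_question: Callable[[str], bool],
    is_inappropriate: Callable[[str], bool],
    answer: Callable[[str], str],
) -> ChatAction:
    """Decide what the bot does with one chat message."""
    # Moderation runs first so the bot never answers a message it would remove.
    if is_inappropriate(message):
        return ChatAction("remove")
    if is_question(message):
        return ChatAction("answer", answer(message))
    return ChatAction("ignore")


# Hypothetical stand-ins; real models would replace these callables.
def toy_is_question(message: str) -> bool:
    return message.rstrip().endswith("?")


def toy_is_inappropriate(message: str) -> bool:
    return "spam" in message.lower()


def toy_answer(message: str) -> str:
    return f"(model reply to: {message})"


if __name__ == "__main__":
    for msg in ["What game is this?", "buy SPAM coins now", "hello everyone"]:
        action = route_chat(msg, toy_is_question, toy_is_inappropriate, toy_answer)
        print(msg, "->", action.kind)
```

Passing the classifiers in as callables keeps the routing logic independent of which model backs each step, so a fine-tuned detector can be swapped in without touching the router.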
The datasets are blended from the public sources below.
- Korean Hate Chat from Discord by Tanat : Kaggle Datasets
- Korean Unsmile Dataset by Smilegate : GitHub Repo
- Gemma2-2b-it model by Google : Hugging Face
- AfreecaTV Chat Crawler by Soohyun-Chae(cha2hyun) : GitHub Repo
- Question vs. Statement classifier, a fine-tuned BERT mini model by Shahrukh Khan : Hugging Face
- Performance comparison of all Korean profanity/curse discriminators by Tanat : GitHub Repo
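As an illustration of what blending several labeled chat datasets can look like, here is a minimal sketch that merges sources into one schema, drops empty and duplicate texts, and shuffles the result. The function name `blend`, the `(text, is_toxic)` input shape, and the output keys are assumptions made for this example; the listed datasets have their own formats and would need to be mapped to this shape first.

```python
import random


def blend(sources, seed=0):
    """Merge labeled chat datasets into one shuffled list of records.

    sources: dict mapping a source name to a list of (text, is_toxic) pairs.
    This input shape is an assumption for the sketch, not a real dataset schema.
    """
    seen = set()
    merged = []
    for name, rows in sources.items():
        for text, is_toxic in rows:
            key = text.strip()
            # Skip empty lines and texts already taken from an earlier source.
            if not key or key in seen:
                continue
            seen.add(key)
            merged.append({"text": key, "label": int(bool(is_toxic)), "source": name})
    # A fixed seed keeps the shuffle reproducible between runs.
    random.Random(seed).shuffle(merged)
    return merged


if __name__ == "__main__":
    demo = {
        "source_a": [("hello", 0), ("some insult", 1)],
        "source_b": [("hello ", 0), ("", 1), ("another insult", True)],
    }
    for record in blend(demo):
        print(record)
```

Keeping a `source` field on each record makes it possible to check later how much each dataset contributes to the blend.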