# Predicting webtoon popularity with two multimodal fusion strategies: a business platform service perspective
This repository supplements the paper "Predicting webtoon popularity with two multimodal fusion strategies: a business platform service perspective".
Webtoon, a digital form of comics, has become a prominent cultural phenomenon in South Korea. With the growing number of mobile users, Webtoon platforms have become more accessible, allowing people to enjoy this content regardless of location and time. Notably, high-quality Webtoons have been adapted into other forms of media, such as movies and dramas, following the concept of 'one-source, multi-use'. Consequently, predicting the popularity of Webtoon content has become crucial for companies aiming to profit across various content markets. In the realm of multimedia, several scholars have explored multimodal classifiers for popularity prediction. Building upon the findings of prior research, we propose a multimodal classifier that leverages images and text to assess the popularity of Webtoon content. To this end, we collected and built a dataset, including Webtoon thumbnails and stories, from the two largest Webtoon platforms in South Korea. The proposed model is based on an early-fusion architecture, combining a long short-term memory (LSTM) network using Ko-Sentence-Transformer embeddings with a convolutional neural network (CNN). The model achieved strong results, with a 94.17% F1-score and 96.82% accuracy in predicting the popularity of Webtoon content. The collected dataset is publicly available at https://github.com/dxlabskku/Webtoon-Popularity.
![earlyfusion_blurred](https://private-user-images.githubusercontent.com/43632309/262237041-425f6d74-7f19-43b8-9872-05296dffb28e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1NjA2NTIsIm5iZiI6MTczOTU2MDM1MiwicGF0aCI6Ii80MzYzMjMwOS8yNjIyMzcwNDEtNDI1ZjZkNzQtN2YxOS00M2I4LTk4NzItMDUyOTZkZmZiMjhlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE0VDE5MTIzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTI3MWM0ZWE0MWNmOGQ0OGZjMWNmOGJhZmM2ZWFkNTU1MDYwMmJkMjJlMjUyYTkzNmI2MTNjMWRkYjhmMWNlZjAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.NO4NE5oFZE_y-zPQAKheJwEdOJIfGu79WNkbWvHbOhg)
Figure 1: Proposed model
We constructed a Webtoon dataset collected from the two major Webtoon platforms in South Korea, Naver Webtoon and Kakao Webtoon. The dataset contains the title, story (synopsis), and thumbnail image of each webtoon, covering 4,770 launched webtoons and 11,931 challenge webtoons.
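To illustrate the dataset fields described above (title, story/synopsis, thumbnail, and launched vs. challenge status), here is a minimal sketch of how such records might be parsed. The column names, file layout, and example rows are assumptions for illustration only and may differ from the repository's actual format.

```python
import csv
import io

# Hypothetical CSV snippet mirroring the described schema:
# one row per webtoon with title, story (synopsis), thumbnail path, and status.
raw = """title,story,thumbnail,status
Example Webtoon A,A short synopsis about A,thumbnails/a.png,launched
Example Webtoon B,A short synopsis about B,thumbnails/b.png,challenge
"""

rows = list(csv.DictReader(io.StringIO(raw)))
launched = [r for r in rows if r["status"] == "launched"]
print(len(rows), len(launched))  # 2 1
```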
We propose an early-fusion multimodal model using image and text features. The image model is a convolutional neural network (CNN) based on the VGG16 architecture trained from scratch, and the text model is a long short-term memory (LSTM) network fed with Ko-Sentence-Transformer embedding vectors.
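The early-fusion idea above can be sketched in PyTorch: image features from a CNN branch and the final LSTM hidden state from the text branch are concatenated before the classification head. This is a minimal illustration of the fusion pattern only; the layer sizes, the small stand-in CNN (instead of a full VGG16), and the embedding dimension are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class EarlyFusionClassifier(nn.Module):
    """Illustrative early-fusion model: concatenate image and text
    features before classification. Sizes are hypothetical."""

    def __init__(self, emb_dim=768, hidden=128, n_classes=2):
        super().__init__()
        # Small CNN stand-in for the VGG16-style image branch.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # -> (batch, 16)
        )
        # LSTM over a sequence of sentence embeddings
        # (e.g. Ko-Sentence-Transformer vectors).
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(16 + hidden, n_classes)

    def forward(self, image, text_seq):
        img_feat = self.cnn(image)             # (batch, 16)
        _, (h_n, _) = self.lstm(text_seq)      # h_n: (1, batch, hidden)
        fused = torch.cat([img_feat, h_n[-1]], dim=1)  # early fusion
        return self.head(fused)

model = EarlyFusionClassifier()
logits = model(torch.randn(4, 3, 64, 64),      # 4 thumbnails
               torch.randn(4, 10, 768))        # 4 stories, 10 sentences each
print(logits.shape)  # torch.Size([4, 2])
```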
TBD