ChatGPT Sentiment Analysis #431
Our team will soon review your PR. Thanks @JIGYASAKARAKOTI :)
- CNN and LSTM look good to me, but replace the other machine learning models with deep learning models such as MobileNet, VGG, and so on.
- In the Dataset folder, rename the file from Dataset Link to README.md and put the dataset link there.
- Add the EDA results/outputs to the README.md file along with the accuracy metrics generated from the project.
Can I use the VADER model and Twitter-roBERTa-base?
Yes, you can.
…_SA_using_DL_updated.ipynb
…_SA using DL.ipynb
…hatgpt sentiment analysis.ipynb
… tweet chatgpt sentiment analysis.ipynb
Approved.
@JIGYASAKARAKOTI
Pull Request for DL-Simplified 💡
Issue Title : ChatGPT Sentiment Analysis
Closes: #411
Describe the add-ons or changes you've made 📃
The code implements sentiment analysis on Twitter data using various machine learning models, including Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Logistic Regression, Naive Bayes, and Random Forest. The dataset is loaded, explored, cleaned, and preprocessed before being used to train and evaluate the models.
Key points:
-> Data Exploration and Cleaning:
The initial exploration revealed the distribution of sentiment labels in the Twitter dataset.
Data cleaning involved removing duplicates, links, special characters, and stopwords.
The data was then balanced to address class imbalance.
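The cleaning and balancing steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code: the stopword list, the function names (clean_tweet, deduplicate, undersample), and the choice of random undersampling as the balancing method are all assumptions made for the example.

```python
import random
import re
from collections import defaultdict

# Tiny illustrative stopword list (assumption); a real run would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "it", "to", "and", "of", "in", "for", "on", "this"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Lowercase, strip links and special characters, then drop stopwords."""
    text = URL_RE.sub(" ", text.lower())
    text = NON_ALPHA_RE.sub(" ", text)
    return " ".join(w for w in text.split() if w not in STOPWORDS)


def deduplicate(rows):
    """Keep the first occurrence of each (text, label) row, keyed on the text."""
    seen, unique = set(), []
    for text, label in rows:
        if text not in seen:
            seen.add(text)
            unique.append((text, label))
    return unique


def undersample(rows, seed=0):
    """Balance classes by randomly downsampling every class to the smallest one."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced
```

Undersampling is only one way to balance classes; oversampling the minority class is an equally plausible reading of the description.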
-> Text Data Preprocessing: Tokenization and lemmatization were applied to convert text data into a suitable format for machine learning models.
Word clouds were used to visualize the most frequent words before and after cleaning.
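As a rough sketch of the preprocessing described above, the snippet below tokenizes text, counts word frequencies (the quantity a word cloud visualizes), and encodes sentences as fixed-length integer sequences of the kind CNN and LSTM models typically consume. Everything here is an assumption for illustration: the function names, the regex tokenizer, and the padding scheme are hypothetical, and lemmatization is left out because it needs an external library such as NLTK.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Split lowercased text into word tokens (letters and apostrophes only)."""
    return TOKEN_RE.findall(text.lower())


def top_words(texts, n=10):
    """Return the n most frequent tokens, i.e. what a word cloud would emphasize."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts.most_common(n)


def build_vocab(texts, max_size=10000):
    """Map the most frequent tokens to integer ids.

    Id 0 is reserved for padding and id 1 for out-of-vocabulary tokens.
    """
    counts = Counter(tok for text in texts for tok in tokenize(text))
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_size))}


def encode(text, vocab, max_len=8):
    """Convert text to a list of ids, truncated or zero-padded to max_len."""
    ids = [vocab.get(tok, 1) for tok in tokenize(text)][:max_len]
    return ids + [0] * (max_len - len(ids))
```

Comparing top_words on the raw and the cleaned tweets gives a numeric counterpart to the before/after word clouds.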
-> Model Training and Evaluation:
The code implements CNN, LSTM, Logistic Regression, Naive Bayes, and Random Forest models for sentiment analysis.
Training and testing sets were created, and models were trained on the training set and evaluated on the testing set.
Various metrics, including accuracy, recall, precision, and F1 score, were calculated to assess model performance.
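To make the evaluation step concrete, here is a small sketch of a train/test split and the four metrics named above, written from their standard definitions with the standard library only. The function names, the 80/20 split, and the "positive" label are assumptions for the example; the notebooks may well use library helpers instead.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one target class."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("y_true and y_pred must not be empty")
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Calling binary_metrics once per model on the same test labels yields directly comparable numbers for each model.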
-> Visualization:
Visualizations, such as bar charts, pie charts, word clouds, and ROC curves, were used to gain insights into data distribution, model performance, and the impact of data preprocessing.
-> User Interaction:
The code includes an interactive function allowing users to input sentences for sentiment prediction, demonstrating the practical use of the trained models.
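An interactive prediction function of this kind might look like the sketch below. The PR does not show its implementation, so this is purely illustrative: keyword_model is a hypothetical placeholder standing in for a trained model, and the cue-word sets, label names, and prompt text are assumptions.

```python
# Hand-picked cue words (assumption); a trained model would replace this lookup.
POSITIVE = {"love", "great", "good", "amazing"}
NEGATIVE = {"hate", "bad", "terrible", "awful"}


def keyword_model(sentence):
    """Placeholder predictor: compares counts of positive and negative cue words."""
    words = set(sentence.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def interactive_loop(predict_fn, read=input, write=print):
    """Prompt for sentences until a blank line, reporting a prediction for each.

    read and write are injectable so the loop can be exercised without a terminal.
    """
    while True:
        sentence = read("Enter a sentence (blank to quit): ").strip()
        if not sentence:
            break
        write(f"Predicted sentiment: {predict_fn(sentence)}")
```

Running interactive_loop(keyword_model) reads from the terminal; passing any other callable that maps a sentence to a label swaps in a different model.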
-> Model Comparison:
By training and evaluating multiple models, the code provides a comparative analysis of their performance on sentiment analysis tasks.
Type of change ☑️
What sort of change have you made:
How Has This Been Tested? ⚙️
Describe how it has been tested
Describe how you have verified the changes made
Checklist: ☑️