Implement binary classification model to detect spam or non-spam email using machine learning algorithms.

RuttonSarker/E-mail-Spam-Detection-Using-Machine-Learning-Algorithms

E-mail Spam Detection Using Machine Learning Algorithms.

This project builds a binary classification model that labels e-mails as spam or non-spam (ham). The dataset is analyzed first, and the most frequent spam and ham words are visualized with the WordCloud Python library. The classification algorithms are evaluated with accuracy, precision, and F1-score, along with a confusion matrix. A voting classifier is then used to improve performance.

Libraries Used:

  • NumPy
  • Pandas
  • Matplotlib
  • Seaborn
  • NLTK
  • WordCloud
  • Scikit-learn

ML Algorithms:

  • Naive Bayes (Gaussian, Bernoulli, Multinomial)
  • SVM
  • Decision Tree
  • Random Forest
  • XGBoost
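
As a rough illustration of how some of the algorithms above could be combined in a voting classifier, here is a minimal sketch. The toy messages, the TF-IDF features, the choice of three estimators, and hard voting are all assumptions made for this example; the repository's actual features and ensemble members may differ.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus: 1 = spam, 0 = ham.
texts = [
    "win a free prize now",
    "claim your free cash reward",
    "urgent call now to win cash",
    "free entry to win a prize",
    "are we still meeting for lunch",
    "please send me the report",
    "see you at the office tomorrow",
    "thanks for the notes from class",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def build_voting_pipeline(random_state=0):
    """Return a TF-IDF + hard-voting pipeline (an assumed configuration)."""
    estimators = [
        ("mnb", MultinomialNB()),
        ("svc", SVC(kernel="linear")),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=random_state)),
    ]
    # Hard voting: each estimator casts one vote and the majority label wins.
    voter = VotingClassifier(estimators=estimators, voting="hard")
    return make_pipeline(TfidfVectorizer(), voter)


model = build_voting_pipeline()
model.fit(texts, labels)
predictions = model.predict(["claim your free prize", "see you at lunch"])
print(predictions)
```

Hard voting is used here because it needs only class predictions from each estimator; soft voting would additionally require every estimator to produce class probabilities.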

Data Analysis

  • Spam and Ham e-mail ratio in dataset

Dataset_Spam_Ratio
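
The class ratio can be computed in a line or two with Pandas. The sketch below uses a hypothetical four-row frame and assumed column names (`label`, `text`); the real dataset's column names are not given here.

```python
import pandas as pd

# Hypothetical miniature dataset; column names are assumptions.
df = pd.DataFrame(
    {
        "label": ["ham", "spam", "ham", "ham"],
        "text": [
            "see you at lunch",
            "win a free prize now",
            "please send the report",
            "thanks for the notes",
        ],
    }
)

# Fraction of messages in each class (values sum to 1.0).
ratio = df["label"].value_counts(normalize=True)
print(ratio)
```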

  • Number of characters, words, and sentences (spam and non-spam e-mail)

Number_words_char_Sentence
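
One way to derive these three counts per message is sketched below. It uses regular expressions as a simple stand-in for a proper tokenizer; the project lists NLTK among its libraries, so its own counts may be based on NLTK's tokenizers instead. The function name `text_stats` is hypothetical.

```python
import re


def text_stats(text):
    """Return character, word, and sentence counts for one message.

    Words are runs of word characters; sentences are non-empty chunks
    between '.', '!' or '?'. This is a rough approximation only.
    """
    words = re.findall(r"\w+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "num_characters": len(text),
        "num_words": len(words),
        "num_sentences": len(sentences),
    }


stats = text_stats("Win cash now! Call today.")
print(stats)
```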

  • Correlation Matrix Graphical Representation

heatmap

  • Spam Wordcloud Representation

spam_wordcloud

  • Top 50 Spam e-mail words

top_words_spam

  • Top 50 Ham e-mail words

top_words_ham
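
A top-N word list like the ones above can be produced with `collections.Counter`. This is a minimal sketch under stated assumptions: the tokenization is a plain lowercase regex, the stopword set is supplied by the caller, and the helper name `top_words` and the sample messages are hypothetical.

```python
import re
from collections import Counter


def top_words(messages, n=50, stopwords=frozenset()):
    """Return the n most frequent (word, count) pairs across messages."""
    counts = Counter()
    for message in messages:
        tokens = re.findall(r"[a-z']+", message.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts.most_common(n)


# Hypothetical spam messages for illustration.
spam_messages = ["free prize free entry", "win a free prize"]
top_spam = top_words(spam_messages, n=2, stopwords={"a"})
print(top_spam)
```

Running the same helper on the ham messages gives the corresponding ham list.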

Classification Model Accuracy, Precision, F1-score

total
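
For reference, the three reported metrics can be computed with scikit-learn as sketched below. The label vectors are made up for illustration (1 = spam, 0 = ham) and are not results from this project.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score

# Hypothetical true and predicted labels: 1 = spam, 0 = ham.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # correct predictions / all predictions
precision = precision_score(y_true, y_pred)  # true spam / everything flagged as spam
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(accuracy, precision, f1)
```

Precision matters particularly for spam filtering, because a false positive means a legitimate e-mail is flagged as spam.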

Confusion Matrix Graphical Representation

confusion
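
The matrix behind a plot like this can be obtained from scikit-learn's `confusion_matrix`. The sketch below uses made-up labels (1 = spam, 0 = ham), not the project's predictions.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels: 1 = spam, 0 = ham.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are true classes and columns are predicted classes, in the order [ham, spam].
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()
print(cm)
```

The resulting array can be passed to Seaborn's `heatmap` with `annot=True` to draw the counts in each cell.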
