Racial Bias Detection in News Articles

Project Overview

This project is our Milestone Project for the Master of Applied Data Science program at the University of Michigan. Our team applied Natural Language Processing (NLP) techniques to detect and quantify racial bias in news articles. The goal is to support fair and inclusive reporting practices by providing a data-driven analysis of how news articles discuss and portray racial dynamics.

By leveraging both supervised and unsupervised machine learning methods, we aim to shed light on the subtle and overt ways racial bias can manifest in media narratives.

Data Source

The primary data source is the "Navigating News Narratives: A Media Bias Analysis Dataset" from Zenodo.org. The dataset contains over 3.7 million rows, from which a subset of 41,769 records related to racial bias was selected for analysis.

Key Features

- Text preprocessing and feature engineering
- Supervised learning models: BERT, LSTM, SVM, SGD
- Unsupervised learning models: LDA for topic modeling, K-means for clustering
- Visualization techniques: word clouds, parallel coordinates plots, silhouette plots
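As a rough illustration of what text preprocessing and feature engineering can look like, the sketch below lowercases text, drops stopwords, and computes simple TF-IDF weights. It is a minimal standard-library example, not the project's actual pipeline: the stopword list, the `tokenize` and `tfidf` helpers, and the sample sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); NLTK ships a much longer one.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to", "was"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sentences, not drawn from the dataset.
docs = [
    "The council approved the housing plan.",
    "Residents criticized the housing plan in the meeting.",
]
vectors = tfidf(docs)
```

Terms that appear in every document (here, "housing" and "plan") receive a weight of zero, while terms unique to one document are weighted up. Vectors like these are the kind of input a linear model such as an SVM could consume.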

Main Findings

Supervised Learning:
- BERT model achieved the highest performance with an F1 score of 0.84 ± 0.0154
- Feature importance analysis revealed key words indicative of bias
- Sensitivity analysis showed model stability across different hyperparameters
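
A score reported as a mean plus or minus a spread, like the F1 figure above, is typically obtained by evaluating the model on several folds or runs and summarizing the per-fold scores. The README does not state how many folds were used, which averaging mode was applied, or whether the spread is a standard deviation, so the sketch below simply assumes binary F1 and a sample standard deviation. The fold labels and predictions are invented for illustration.

```python
from statistics import mean, stdev


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives means precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (true labels, predicted labels) pairs, one per fold.
folds = [
    ([1, 0, 1, 1, 0], [1, 0, 1, 0, 0]),
    ([1, 1, 0, 0, 1], [1, 1, 0, 1, 1]),
    ([0, 1, 1, 0, 1], [0, 1, 1, 0, 1]),
]
fold_scores = [f1_score(t, p) for t, p in folds]

# Assumption: the spread is the sample standard deviation across folds.
avg, spread = mean(fold_scores), stdev(fold_scores)
print(f"F1 = {avg:.2f} ± {spread:.4f}")
```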
Unsupervised Learning:
- LDA identified 5 distinct topics related to racial discussions
- K-means clustering used 17 clusters, the number selected as optimal
- PCA and t-SNE projections helped in visualizing cluster distributions
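
The README lists silhouette plots among its visualizations but does not say which criterion was used to choose the number of clusters. One common approach, assumed here, is to compare the mean silhouette coefficient across candidate cluster counts and prefer the highest. The sketch below implements that coefficient with the standard library only; in practice `sklearn.metrics.silhouette_score` would normally be used. The points and labelings are toy data.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # groups nearby points together
bad = silhouette_score(points, [0, 1, 0, 1])   # groups distant points together
```

A labeling that groups nearby points scores close to 1, while one that groups distant points scores below 0. Repeating this for each candidate cluster count gives a curve from which a value can be picked.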

Tools Used

- Python
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn
- Gensim
- NLTK
- TensorFlow/Keras

Future Work

- Fine-tune the BERT model for improved performance
- Explore advanced feature engineering techniques
- Analyze a larger dataset for more robust results

Ethical Considerations

The project addresses potential biases in data and model interpretations, emphasizing the importance of responsible data analysis when dealing with sensitive topics like racial bias.

Team Information

Amanda Fear, Ejaz Alam, Nikolay Jamgaryan

Acknowledgments

Zenodo.org for hosting the "Navigating News Narratives: A Media Bias Analysis Dataset"

For a detailed analysis, please refer to the full project report in the Project Report folder.

Repository: ejazalam831/racial-bias-detection-using-nlp
