This project applies text mining and data analysis to the TRC Victims dataset to study victim demographics and the free-text descriptions recorded for each entry. It combines natural language processing (NLP) techniques with data visualization to surface and interpret the main patterns.
- Source: TRC Victims dataset
- Total Rows: 21,747
- Columns: name, age, description, url
- Conduct text mining on victim-related data to extract meaningful patterns and trends.
- Perform data cleaning and preprocessing to handle inconsistencies and prepare for analysis.
- Use visualizations to highlight trends and insights.
- Derive actionable findings from both text and structured data.
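
As a rough illustration of the cleaning and preprocessing objective, the sketch below shows one way the four columns (`name`, `age`, `description`, `url`) might be tidied with pandas. The function name `clean_victims`, the specific cleaning rules, and the toy rows are all assumptions made for illustration; they are placeholders, not records from the dataset and not necessarily the steps this project uses.

```python
import pandas as pd


def clean_victims(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a frame with name, age, description, url columns.

    Hypothetical helper: the rules below are illustrative assumptions.
    """
    out = df.copy()
    # Normalise text columns: fill gaps, cast to string, collapse whitespace, trim.
    for col in ("name", "description", "url"):
        out[col] = (
            out[col]
            .fillna("")
            .astype(str)
            .str.replace(r"\s+", " ", regex=True)
            .str.strip()
        )
    # Ages that are not numeric (for example "unknown") become NaN.
    out["age"] = pd.to_numeric(out["age"], errors="coerce")
    # Drop rows without a name, then drop exact duplicate rows.
    out = out[out["name"] != ""].drop_duplicates().reset_index(drop=True)
    return out


# Placeholder rows only; these are NOT taken from the TRC Victims dataset.
toy = pd.DataFrame(
    {
        "name": ["  Person A ", "Person B", "Person B", None],
        "age": ["34", "unknown", "unknown", "20"],
        "description": ["Placeholder   text one.", None, None, "Placeholder text two."],
        "url": [
            "https://example.org/a",
            "https://example.org/b",
            "https://example.org/b",
            "https://example.org/c",
        ],
    }
)
cleaned = clean_victims(toy)
```

Coercing `age` with `errors="coerce"` keeps rows with unparseable ages instead of discarding them, so the text in `description` stays available for the NLP steps.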
- PIL (installed as the Pillow package)
- matplotlib.pyplot
- nltk
- numpy
- os
- pandas
- seaborn
- sklearn.decomposition
- sklearn.feature_extraction.text
- sklearn.pipeline
- tf_keras
- tidytext
- top2vec
- wordcloud
- Cleaned and preprocessed dataset ready for analysis.
- NLP and text mining insights to complement structured data findings.
- Visualizations that highlight trends and key data points.