This project is an exploratory data analysis (EDA) of the Stack Overflow Developer Survey 2023 dataset. The goal is to clean, transform, and visualize the data in order to answer a few questions about trends in the developer community. The analysis uses the Python libraries Pandas, Matplotlib, and Seaborn.
The Stack Overflow Developer Survey 2023 dataset is used for this analysis. It includes attributes related to developer demographics, technology preferences, job satisfaction, and more. The dataset can be found here: https://survey.stackoverflow.co/
- Python
- Pandas
- Matplotlib
- Seaborn
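As a starting point, loading the survey with Pandas might look like the sketch below. The column names are assumptions based on the public survey export and should be checked against the schema file that ships with the download. The helper `load_survey` is hypothetical, and a small made-up CSV string stands in for the real file.

```python
import io

import pandas as pd

# Assumed column names; verify them against the survey's schema file.
COLUMNS = ["ResponseId", "Country", "EdLevel", "YearsCodePro",
           "ConvertedCompYearly", "RemoteWork"]


def load_survey(source, columns=COLUMNS):
    """Read only the selected columns from a CSV path or file-like object."""
    return pd.read_csv(source, usecols=columns)


# Tiny made-up sample standing in for the real survey CSV.
sample_csv = io.StringIO(
    "ResponseId,Country,EdLevel,YearsCodePro,ConvertedCompYearly,RemoteWork,Extra\n"
    "1,India,Bachelor,3,12000,Remote,x\n"
    "2,Germany,Master,10,80000,In-person,y\n"
)
survey = load_survey(sample_csv)
print(survey.shape)
```

With the real data, the `io.StringIO` object would be replaced by the path to the downloaded CSV file.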
Data Cleaning:
- Handled missing values
- Standardized data formats
- Removed duplicates
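The cleaning steps above could be sketched as follows. This is an illustration on made-up rows, not the project's actual code: the column names, the text answers mapped to numbers, and the choice to drop rows without compensation are all assumptions.

```python
import pandas as pd

# Made-up rows; column names and answer texts are assumptions.
raw = pd.DataFrame({
    "ResponseId": [1, 2, 2, 3, 4],
    "Country": ["India", "India", "India", "India", "India"],
    "YearsCodePro": ["3", "Less than 1 year", "Less than 1 year",
                     "More than 50 years", None],
    "ConvertedCompYearly": [12000.0, 8000.0, 8000.0, None, 15000.0],
})


def clean(df):
    # Remove exact duplicate rows.
    out = df.drop_duplicates().copy()
    # Standardize formats: map text answers to numbers, then convert the column.
    years = out["YearsCodePro"].replace(
        {"Less than 1 year": "0.5", "More than 50 years": "51"}
    )
    out["YearsCodePro"] = pd.to_numeric(years, errors="coerce")
    # Handle missing values: drop rows with no reported compensation.
    out = out.dropna(subset=["ConvertedCompYearly"])
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

Dropping rows is only one way to handle missing values; imputing a median or keeping the rows for questions that do not need compensation are equally reasonable choices.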
Data Transformation:
- Restructured data for better analysis
- Created new calculated fields where necessary
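As an example of calculated fields, the sketch below adds a master's-degree flag and an experience band. The field names `HasMasters` and `ExperienceBand`, the bin edges, and the sample rows are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Made-up rows; column names are assumptions.
df = pd.DataFrame({
    "EdLevel": ["Bachelor's degree", "Master's degree", None, "Master's degree"],
    "YearsCodePro": [0.5, 4.0, 12.0, 25.0],
})


def add_fields(df):
    out = df.copy()
    # Flag respondents whose education level mentions a master's degree;
    # missing answers count as False.
    out["HasMasters"] = out["EdLevel"].str.contains("Master", na=False)
    # Bucket professional experience into bands (bin edges are arbitrary).
    out["ExperienceBand"] = pd.cut(
        out["YearsCodePro"],
        bins=[0, 2, 5, 10, 60],
        labels=["up to 2", "2 to 5", "5 to 10", "over 10"],
        include_lowest=True,
    )
    return out


result = add_fields(df)
print(result)
```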
Data Wrangling:
- Filtered and selected relevant columns for analysis
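Filtering and column selection might look like the sketch below, which keeps only respondents from India and the columns needed for the questions in this analysis. The column list `RELEVANT` and the helper `select_india` are assumptions for illustration.

```python
import pandas as pd

# Made-up rows; column names are assumptions.
survey = pd.DataFrame({
    "ResponseId": [1, 2, 3],
    "Country": ["India", "Germany", "India"],
    "EdLevel": ["Bachelor", "Master", "Master"],
    "YearsCodePro": [3.0, 10.0, 6.0],
    "ConvertedCompYearly": [12000.0, 80000.0, 20000.0],
    "RemoteWork": ["Remote", "In-person", "Hybrid"],
    "Age": ["25-34 years old", "35-44 years old", "25-34 years old"],
})

RELEVANT = ["Country", "EdLevel", "YearsCodePro",
            "ConvertedCompYearly", "RemoteWork"]


def select_india(df, columns=RELEVANT):
    # Keep rows for India and only the columns of interest.
    mask = df["Country"] == "India"
    return df.loc[mask, columns].reset_index(drop=True)


india = select_india(survey)
print(india)
```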
Data Visualization (used to answer the following questions):
- Qualifications: Does a master's degree give you any significant advantage?
- Coding Experience: How much does coding experience affect compensation?
- Average Salary: Which group of employees is paid the most on average in India?
- Working Preferences: How much does remote working matter to employees?
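Questions like these can be approached by comparing compensation across groups. The sketch below is one possible way to do it, not the project's actual plots: it computes the median compensation per category with Pandas and draws it with Seaborn. The helpers `median_comp` and `plot_median_comp`, the column names, and the data are all made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows; column names are assumptions.
india = pd.DataFrame({
    "EdLevel": ["Bachelor", "Master", "Master", "Bachelor"],
    "RemoteWork": ["Remote", "Hybrid", "Remote", "In-person"],
    "ConvertedCompYearly": [10000.0, 20000.0, 30000.0, 12000.0],
})


def median_comp(df, by, value="ConvertedCompYearly"):
    """Median compensation per category, highest first."""
    return (
        df.groupby(by)[value]
        .median()
        .sort_values(ascending=False)
        .reset_index()
    )


def plot_median_comp(df, by):
    summary = median_comp(df, by)
    fig, ax = plt.subplots(figsize=(6, 4))
    # One bar per category, using the precomputed medians.
    sns.barplot(data=summary, x=by, y="ConvertedCompYearly", ax=ax)
    ax.set_title(f"Median yearly compensation by {by}")
    ax.set_ylabel("Median compensation")
    return fig, summary


fig, summary = plot_median_comp(india, "EdLevel")
print(summary)
```

The same helper can be called with `"RemoteWork"` or any other categorical column. The median is used here instead of the mean because reported salaries tend to contain extreme outliers.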
The survey was conducted globally, but this EDA uses only the respondents from India so that the questions could be more specific and locally relevant.
Further EDA could be done for any other country, or for the entire dataset regardless of country, to answer a wider range of questions.
The exploratory data analysis provided insights into the developer community in India, highlighting key demographic trends, technology usage, and the correlations among them. These findings can inform stakeholders and guide future decisions in the tech industry.