UN SDG data analysis

Capstone project for the HarvardX Data Science program (completed in 2019). All data analysis was done in R.

Overview

This project explores global progress towards the United Nations Sustainable Development Goals (UN SDG). The main dataset is the UN SDG Indicators from the UN Statistics Division. It is complemented by data from the Gapminder Foundation and the World Bank. The maps and mapdata R packages are used to visualize the reported progress for each country on a world map.
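As a rough illustration of the world-map visualization, the sketch below colours a few countries by an indicator value using the maps package. The data frame `access`, its values, the class breaks, and the colours are all hypothetical and are not taken from the project; the plotting step is skipped if maps is not installed.

```r
# Hypothetical values: share of population with access to electricity (%).
# These numbers are made up for illustration only.
access <- data.frame(
  country = c("Kenya", "India", "Brazil"),
  pct     = c(60, 90, 99)
)

# Bin the values into three colour classes (breaks are an assumption).
breaks       <- c(0, 75, 95, 100)
fill_colours <- c("#fee8c8", "#fdbb84", "#e34a33")
access$colour <- fill_colours[
  cut(access$pct, breaks = breaks, labels = FALSE, include.lowest = TRUE)
]

# Draw a grey base map, then overlay each country in its class colour.
if (requireNamespace("maps", quietly = TRUE)) {
  maps::map("world", fill = TRUE, col = "grey90", border = "white")
  for (i in seq_len(nrow(access))) {
    maps::map("world", regions = access$country[i], fill = TRUE,
              col = access$colour[i], add = TRUE)
  }
}
```

Region names must match those used by the maps "world" database, so country names from other sources may need to be harmonized first.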

Goals

  • Programmatically download data
  • Clean and merge data into unified datasets
  • Communicate results graphically
  • Use trend analysis to predict future results

| Data | Source |
| --- | --- |
| UN Sustainable Development Goals Indicators | bigrquery (R interface to BigQuery) |
| Gapminder geography data | googlesheets |
| World Bank population data | wbstats (R interface to World Bank API) |
| Spatial map data | maps and mapdata |

Process

The analysis process was broken down into:

  1. Downloading data
  2. Cleaning and merging data
  3. Exploring and visualizing data
  4. Forecasting
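
The merging and forecasting steps above can be sketched in base R. This is a minimal example under stated assumptions, not the project's actual code: the data frames `indicator` and `geography`, the helper `forecast_one`, and all values are hypothetical, and a per-country linear regression stands in for whatever trend model the report uses.

```r
# Hypothetical indicator values per country and year (made-up numbers).
indicator <- data.frame(
  iso3  = rep(c("KEN", "IND"), each = 5),
  year  = rep(2012:2016, times = 2),
  value = c(30, 35, 40, 45, 50, 80, 82, 84, 86, 88)
)

# Hypothetical geography lookup, standing in for the Gapminder data.
geography <- data.frame(
  iso3   = c("KEN", "IND"),
  region = c("Africa", "Asia")
)

# Merge the two sources on the shared country code.
merged <- merge(indicator, geography, by = "iso3")

# Fit a linear trend for one country and extrapolate to a target year.
forecast_one <- function(df, target_year) {
  fit <- lm(value ~ year, data = df)
  data.frame(
    iso3      = df$iso3[1],
    region    = df$region[1],
    year      = target_year,
    predicted = unname(predict(fit, newdata = data.frame(year = target_year)))
  )
}

# Apply the forecast to each country and combine the results.
forecasts <- do.call(
  rbind,
  lapply(split(merged, merged$iso3), forecast_one, target_year = 2020)
)
print(forecasts)
```

A straight-line extrapolation can exceed natural bounds such as 100%, so forecasts for percentage indicators would need to be capped or modelled differently.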

Details and results

See the PDF report for details and results.

Example visualization

GIF: Population with access to electricity