- kaggle.com/competitions
- UCI Machine Learning Repository, "Most Popular Data Sets"
- scikit-learn Toy Datasets
- kaggle.com/datasets
- 18 places to find data sets for data science projects
- 100+ Interesting Data Sets for Statistics
- Data Is Plural archive
- Likelihood of a flight getting cancelled
- Do Rich People Take More Taxis?
- Machine Learning & Food Classification
- Personal Website
- Teaching computers to play bluegrass banjo, with context-free grammars
- Get data from an image of a chart, using Python & OpenCV
- First submission to Kaggle Instacart competition
- The Beautiful Game - Analysis of Football Events
- Bootcamp Student Responds to Net Neutrality Project Going Viral
- Why your relationship is likely to last (or not): using Local Interpretable Model-Agnostic Explanations (LIME)
- Sales Analytics: How to Use Machine Learning to Predict and Optimize Product Backorders
- The Pudding visual essays
- "25 incredible speakers demystify data science, and discuss the training, the tools, and the career path to the best job in the United States."
- "This series of videos presents a case study in how I personally approach reproducible data analysis within the Jupyter notebook."
- Imposter Syndrome by Brandon Rohrer, Data Scientist at Facebook, formerly Microsoft
- Imposter Syndrome in Data Science by Caitlin Hudon, Data Scientist at Web.com, co-founder of R-Ladies Austin
- Effort Shock and Reward Shock
- Khan Academy: You Can Learn Anything (Growth Mindset)
- Jupyter demo videos
- try.jupyter.org
- Download Anaconda Distribution
- Jupyter Notebook Best Practices for Data Science ("Lab Notebooks" vs "Deliverable Notebooks")
- Documentation
- Official cheat sheet
- DataSchool.io videos
- Python Data Science Handbook, Chapter 3
- Modern Pandas
- PythonPlot.com. "Interactive comparison of Python plotting libraries for exploratory data analysis. Examples of using Pandas plotting, plotnine, Seaborn, and Matplotlib."
- Matplotlib documentation
- Seaborn documentation
- Strong Titles Are The Biggest Bang for Your Buck
- How to Generate FiveThirtyEight Graphs in Python
- Python Data Science Handbook, Chapter 4
- Statistics Without the Agonizing Pain – 12-minute video
- Statistics for Hackers – 33-minute video & slides
- xkcd comics: Significant, P-Values, more
- 538: Not Even Scientists Can Easily Explain P-values, interactive p-hacking demo
- Think Stats, Chapter 9
- How (and why) to create a good validation set
- Cross-Validation Gone Wrong
- Python Data Science Handbook, Chapter 5.3
- Statistical Modeling: The Two Cultures, Sections 1-7
- Causal inference in economics and marketing
- Interpretable machine learning
- When to Act on a Correlation, and When Not To
- Statistical Modeling: The Two Cultures, Sections 8-12
- Crestle.com. "Effortless infrastructure for deep learning." Jupyter Notebook in the cloud with GPU, $0.59/hour.
- Keras tutorials
- Notebooks for Deep Learning with Python