Free E-Books


    • Python for Everybody (2016) is a gentle introduction to Python for beginners, complete with videos and course materials.
    • Python 101 (2019) is an online book that starts with Python’s basics but ramps up to more advanced topics.
    • Python Data Science Handbook (2016) is available for free on GitHub and includes both the text and accompanying Jupyter notebooks. The book walks you through standard data science operations in Python, including using a notebook, manipulating data, visualizing data, and building some standard models.
    • Think Python, 2nd Edition (2015)

Probability, Statistics

    • Think Bayes (2012). Here, you’ll play with conditional probabilities and priors. This book contains Python applications.
    • Bayesian Methods for Hackers (2020). You will play with more advanced Bayesian algorithms such as multi-armed bandits and MCMC. This book contains Python applications.
    • Introduction to Probability, 2nd Edition (2019). This is the official textbook of Harvard’s Stat 110, which I had the honor of being a teaching fellow for. The second edition is available for free at net. Also, check out the companion Probability Cheatsheet.
    • OpenIntro Statistics, 4th Edition (2019). This is a high-quality, complete textbook available on a pay-what-you-want (PWYW) basis. It covers statistics from the basics through more advanced topics (like power calculations).
    • Think Stats, 2nd Edition (2014). Here, you’ll start off plotting and understanding distributions, and learning about hypothesis testing and regression. This book contains Python applications.

Machine Learning, Deep Learning, Data Mining

    • Mining of Massive Datasets, 3rd Edition (2020) is based on Stanford’s eponymous class and covers popular problems such as recommendation systems, PageRank, and social network analysis. Learn more about the book and the class at
    • Machine Learning Yearning (2018) by Andrew Ng is aimed at practical considerations for people developing ML systems. The book isn’t too technical but is best read after you’ve played around with some ML projects of your own.
    • Deep Learning (2016) is written by some of the pioneers of the field and gets quite heavy on math. An HTML version of the book is available for free from their website.

Practicing Data Science


Blogs & Articles


Cheat Sheets

    • Useful Cheat Sheets (Pandas, Regex, Python for data science, NumPy)