data science resources
A short collection of resources I've found most useful for learning and revisiting python programming, machine learning, and data visualization.
Table of Contents
- Python Programming
- Machine Learning
- Machine Learning and AI News and Developments
- Data Visualization
- Computer science/software engineering
- Git
Python Programming
Books
- Effective Python: 90 Specific Ways to Write Better Python, 2nd edition; Brett Slatkin
- Strongly recommend for all levels of Python ability
- Breaking down of Python into different chapters, then with the takeaway for the section as the title of the section, makes it easy to locate what you want
- Each section has a motivating narrative, with a scenario, a set of improving code implementations, and a summary of practical takeaway messages making the tips very impactful
- Python Tricks: The Book; Dan Bader
- Closely behind Effective Python in my recommendations
- Similarly, breaks down sections by themes, with code to emphasize the points
Online Resources
- Real Python
- Pages on specific topics, for example, object-oriented programming, decorators, functional programming
- Structure of pages means that you can often quickly find an example of what you want an answer to, but can hang around a bit longer to have a more in-depth look into the topic in question
- Can subscribe to Real Python to get access to further materials, including courses consisting of different materials
- DataCamp
- Has good courses, but have to pay for them or get access through employers
- Combination of videos, text, and exercises makes courses stimulating
- Covers Python programming, but also has data science-focused courses, for example, data visualization and machine learning
- A good option for learning Python, but courses may not be what intermediate or advanced Python users are looking for, especially for topics around machine learning
Podcasts
- Python Bytes
- Translating Python discussion into an audio medium is challenging, but this light-hearted podcast is a great listen for packages, techniques, and opinions on Python
- Contains some information on web development that might not be of interest to data scientists, but does give a break from the oft-repeated pandas/numpy/sklearn guidance.
Machine Learning
Books
- Machine Learning with PyTorch and Scikit-Learn; Sebastian Raschka
- Comprehensive guide to using sklearn and PyTorch, with derivation of techniques, in-depth code examples, and detailed justification
- Most recent edition gives details of up-to-date techniques like reinforcement learning, GANs, and transformers
- An Introduction to Statistical Learning, with Applications in Python; Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, Jonathan Taylor
- Version of previous book which only had examples in R
- More in-depth focus on statistics than other books, definitely a good book for data scientists looking to strengthen their statistics knowledge
Machine Learning and AI News and Developments
Online Resources
- Ahead of AI; Sebastian Raschka
- Articles distill key information and ideas from AI publications
- Articles include monthly summaries of key publications, significant new ideas in machine learning, and practical tips for applying new developments
Podcasts
- The AI Breakdown; Nathaniel Whittemore
- High-level podcast about latest stories and opinion pieces in AI
Data Visualization
Online Resources
- Matplotlib Quick Start Guide
- Guide to getting started with mpl, often skipped whenever visualizing with mpl
- Describes usage of the two paradigms in mpl - the explicit object-oriented style and the functional pyplot style. The distinction and usage between these two is a very common source of confusion for users
- Python Graph Gallery
- Want to make a plot in Python but can’t remember which of matplotlib/seaborn/pyplot is best for this and how you get started? This is the website for you
- Grouped by (many) types of plots, with code examples for all the libraries that provide options for this
Computer science/software engineering
Online Resources
- The Missing Semester of Your CS Education: ./missing-semester
- A course that covers the tools programmers will use the most “as fluid and frictionless as possible”, including shell tools (bash), editors (Vim), and version control (Git)
- I would recommend that anyone and everyone who programs takes the time out to do this course. This is the course I wish had been the first thing I had done at the start of my PhD
Git
Git undo flowchart
From repeated searching of the correct git command to undo some changes in a code repository, I have compiled a flowchart for finding the correct command that can be found here.