Data science is a rapidly growing field.
That means there is an incredible amount of material out there about it, from non-technical guides for the casually interested, all the way to in-depth technical books and articles. There is also lots of “aggregate content” – collections of books and articles to help you find the right handful of resources, rather than be overwhelmed by all that’s out there.
In today’s post I thought I’d add my own recommendations to the ether. These are strictly books, because I find it’s nice to do some learning away from the screen in book form every now and again. I’ve tried to go for a mix of technical and non-technical, so there’s something for everyone.
These aren’t in any particular order, so it’s not a Top 5, just a… 5.
I have to say I was skeptical about reading a book simply entitled “Big Data” because of the meaningless hype that comes with that phrase. I was pleasantly surprised, of course; otherwise the book wouldn’t be on the list. This is the least technical and least complex book here, and it is aimed at a much wider audience. The authors highlight the power that enormous datasets possess, and they also cover difficult issues such as data privacy.
It might feel “too easy” to include this one, but it’s also too good to leave out. Nate Silver and his prediction skills are very topical what with the US election happening as we speak. His website FiveThirtyEight is arguably the number one source for election forecasts.
I like the breadth of topics covered in this book. Silver tackles the difficulties, failures, and successes of forecasting the weather, baseball, earthquakes and the economy. The book is very dense with information, so it’s definitely not a light read, but it’s very much recommended for anyone with more than a passing interest in forecasting with data.
Not a book I ever expected to see, but this one teaches you data science fundamentals in Excel. This actually makes a lot of sense, because that way you can focus on the concepts rather than the shiny tools that are available. The assumption is that everyone knows how to use Excel (although it turns out there was a lot I didn’t know!) so the barrier to entry is near zero.
I like a well-written book, even if it’s very technical, and this one ticks that box too.
This was the first hands-on data science book I bought, and I got lucky because it’s a great way to start. It assumes no previous knowledge of data science or programming, although even a little background in Python will help you get off the ground. It teaches data science concepts from first principles, with Python implementations that avoid ready-made third-party libraries such as scikit-learn.
As I mentioned in a previous post, I’m a big fan of learning intuitions first and details later. This book does exactly that; it starts certain chapters with a real world data science task and then walks you through the solution. Once you’ve solved the problem using raw Python, Grus suggests the appropriate libraries to use in the real world.
On top of that, a large chunk of the book deals with the things you should know before diving into data science, such as statistics and probability.
An all-round valuable resource, and an enjoyable read.
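To give a flavour of that “raw Python first, library second” approach, here is a small sketch of my own. It is not taken from the book; the function names and the sample data are made up for illustration. It computes a mean and a sample standard deviation by hand, then checks the result against Python’s standard `statistics` module.

```python
import math
import statistics


def mean(xs):
    """Arithmetic mean, written out by hand."""
    return sum(xs) / len(xs)


def std_dev(xs):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(xs)
    squared_deviations = sum((x - m) ** 2 for x in xs)
    return math.sqrt(squared_deviations / (len(xs) - 1))


# Made-up numbers, purely for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

from_scratch = std_dev(data)
from_library = statistics.stdev(data)  # the "real world" equivalent

print("from scratch:", from_scratch)
print("from library:", from_library)
```

Writing the function out once makes it obvious what the library call is doing for you, which is the whole point of learning it this way round.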
The title speaks for itself. This one is a technical dive into machine learning using Python. Concepts are all explained very well by an author whose website is a treasure trove of information about machine learning and data science.
It’s always nice to see a book go beyond describing the algorithms – there’s additional material around deploying your machine learning solutions to the web.
Python Machine Learning will remain a reference guide for me for a long time.
That’s my two cents. I’m always on the lookout for more recommendations, so send any my way!
Footnote: This is the 8th entry in my 30-day blog challenge.