9 Data Science-related books to ask Santa for Christmas

Source: Deep Learning on Medium

This week I was going to upload a story about the importance of properly encoding categorical variables, but given that we’re very close to Christmas, I thought it would be a good idea to write a few lines about books I’ve found during the year and that I myself would have enjoyed receiving as a gift.

The recommendations are going to be divided according to the following categories:

  • Non-technical books
  • Technical books
  • Data visualization books

You’ll find 3 of each, and hopefully, I’ll surprise you with some.

Non-technical books

Naked Statistics — Rating: 4 out of 5

Even though some chapters can feel too basic, Charles Wheelan walks us through lots of concepts we use and mention in our everyday lives, showcasing classic mistakes and good practices around all of them. It’s also a good source of real-life examples of technical concepts if you’re studying and not yet working in the field.

A bit more about it:

Once considered tedious, the field of statistics is rapidly evolving into a discipline Hal Varian, chief economist at Google, has actually called “sexy.” From batting averages and political polls to game shows and medical research, the real-world application of statistics continues to grow by leaps and bounds. How can we catch schools that cheat on standardized tests? How does Netflix know which movies you’ll like? What is causing the rising incidence of autism? As best-selling author Charles Wheelan shows us in Naked Statistics, the right data and a few well-chosen statistical tools can help us answer these questions and more.

Source: http://goodreads.com/

Algorithms to Live By — Rating: 4.5 out of 5

A bit slow at the beginning, and perhaps even a little repetitive at times, but with some truly amazing chapters afterwards, this is a book that you’ll probably have to read a couple of times to fully process. In my case, it gave me tools for my everyday life and work, and left me with questions and curiosity about topics I went on to read more about after finishing the book.

A bit more about it:

All our lives are constrained by limited space and time, limits that give rise to a particular set of problems. What should we do, or leave undone, in a day or a lifetime? How much messiness should we accept? What balance of new activities and familiar favourites is the most fulfilling? These may seem like uniquely human quandaries, but they are not: computers, too, face the same constraints, so computer scientists have been grappling with their version of such issues for decades. And the solutions they’ve found have much to teach us.

Source: http://goodreads.com

Invisible Women: Data Bias in a World Designed for Men — Rating: 5 out of 5

As a friend of mine said to me: “Don’t be put off because it’s ‘about women’, because that kind of means it’s about men too anyway!” I haven’t finished this book yet, but so far it has proven to be amazing. The research done by Caroline Criado Perez is truly impressive, and it exposes real facts about an issue we should all be dealing with.

A bit more about it:

Imagine a world where your phone is too big for your hand, where your doctor prescribes a drug that is wrong for your body, where in a car accident you are 47% more likely to be seriously injured, where every week the countless hours of work you do are not recognised or valued. If any of this sounds familiar, chances are that you’re a woman.

Invisible Women shows us how, in a world largely built for and by men, we are systematically ignoring half the population. It exposes the gender data gap — a gap in our knowledge that is at the root of perpetual, systemic discrimination against women, and that has created a pervasive but invisible bias with a profound effect on women’s lives.

Source: http://goodreads.com

Bonus track: in case you don’t want to read the entire book, or perhaps you just want a sneak peek, another friend of mine recommended an episode of the 99% Invisible podcast in which they interviewed the author.

Technical books

Deep Learning with Python — Rating: 4 out of 5

Love Python? Want to learn more about deep learning? François Chollet is the creator of Keras, one of the most widely used deep learning libraries. This book goes from the intuition behind the concepts to their practical application, showing examples along the way and making the journey enjoyable for the reader.

A bit more about it:

Deep learning is applicable to a widening range of artificial intelligence problems, such as image classification, speech recognition, text classification, question answering, text-to-speech, and optical character recognition. It is the technology behind photo tagging systems at Facebook and Google, self-driving cars, speech recognition systems on your smartphone, and much more. In particular, Deep learning excels at solving machine perception problems: understanding the content of image data, video data, or sound data. Here’s a simple example: say you have a large collection of images, and that you want tags associated with each image, for example, “dog,” “cat,” etc. Deep learning can allow you to create a system that understands how to map such tags to images, learning only from examples. This system can then be applied to new images, automating the task of photo tagging. A deep learning model only has to be fed examples of a task to start generating useful results on new data.

Source: http://goodreads.com

The Hundred-Page Machine Learning Book — Rating: 5 out of 5

I started reading this book online in pieces, each time I needed to better understand a concept, but I found it so good, clear and helpful that I recently decided to buy the printed edition to have at home. It goes through all the most important concepts and algorithms in machine learning, explaining how they work in both a technical and a non-technical way.

A bit more about it:

Supervised and unsupervised learning, support vector machines, neural networks, ensemble methods, gradient descent, cluster analysis and dimensionality reduction, autoencoders and transfer learning, feature engineering and hyperparameter tuning! Math, intuition, illustrations, all in just a hundred pages!

Source: http://themlbook.com/

Bonus track: the book has its own webpage, where you can find different format options, as well as reviews and comments from different people.

Probability: For the Enthusiastic Beginner — Rating: 4 out of 5

Feeling a bit rusty about your probability knowledge and skills? Then this is the book for you. It goes through all the most important topics in a very friendly way, full of examples and worked-out problems. Revisit concepts such as combinatorics, the rules of probability, Bayes’ theorem, expectation value, variance, probability density, common distributions, the law of large numbers, the central limit theorem, correlation, and regression. There’s not much else you actually need to know about this book. A handy book to have close to you at home.

Data visualization books

I consider data visualization such an important part of data science that some time ago I published a story called ‘10 tips to improve your plotting. Because in real-life data science, plotting does matter’. In that story, I wrote a bit about why I think data visualization is so important, but if you’re interested in reading more about it and improving your understanding and skills, then the following books will surely help you do that.

Good Charts — Rating: 4.5 out of 5

As with The Hundred-Page Machine Learning Book, I first found this book online, read some chapters and decided to buy it to have at home. A great addition to any library, not only because it is beautifully designed but also because it is very enjoyable to read.

A bit more about it:

A good visualization can communicate the nature and potential impact of information and ideas more powerfully than any other form of communication. (…) building good charts is quickly becoming a need-to-have skill for managers. If you’re not doing it, other managers are, and they’re getting noticed for it and getting credit for contributing to your company’s success. In Good Charts, dataviz maven Scott Berinato provides an essential guide to how visualization works and how to use this new language to impress and persuade. Dataviz today is where spreadsheets and word processors were in the early 1980s — on the cusp of changing how we work. Berinato lays out a system for thinking visually and building better charts through a process of talking, sketching, and prototyping.

Source: http://goodreads.com

The Visual Display of Quantitative Information — Rating: 4 out of 5

Not as appealing to the eye as Good Charts, but extremely rich in content. First published in 1983, this book has been something of a bible of data visualization for a long time. Though it’s true that some things in the book are outdated nowadays, it introduced concepts and tools you’re probably still using today.

A bit more about it:

The classic book on statistical graphics, charts, tables. Theory and practice in the design of data graphics, 250 illustrations of the best (and a few of the worst) statistical graphics, with detailed analysis of how to display data for precise, effective, quick analysis. Design of the high-resolution displays, small multiples. Editing and improving graphics. The data-ink ratio. Time-series, relational graphics, data maps, multivariate designs. Detection of graphical deception: design variation vs. data variation. Sources of deception. Aesthetics and data graphical displays.

Source: http://goodreads.com

Information is Beautiful — Rating: 4.5 out of 5

This is a recommendation given by a friend of mine, so even though I can’t give you my opinion about the book, I’m leaving its full description below:

Every day, every hour, every minute we are bombarded with information, from television, from newspapers, from the Internet, we’re steeped in it. We need a way to relate to it. Enter David McCandless and his stunning infographics, simple, elegant ways to interact with information too complex or abstract to grasp any way but visually. McCandless creates visually stunning displays that blend the facts with their connections, contexts, and relationships, making information meaningful, entertaining, and beautiful. And his genius is as much in finding fresh ways to provocatively combine datasets as it is in finding new ways to show the results.

Knowledge is Beautiful is a fascinating spin through the world of visualized data, all of it bearing the hallmark of David McCandless’s boundary-breaking, signature style. The captivating follow-up to the bestseller The Visual Miscellaneum, Knowledge is Beautiful offers a deeper, more ranging look at the world and its history, with more connectivity between the pages, a greater exploration of causes and consequences, and a more inclusive global outlook. With a portion of its content crowd-sourced from McCandless’s international following, Knowledge is Beautiful achieves a revolutionary and democratic look at the key issues from questions on history and politics, the facts of science, streams of literature, and much more.

Source: http://goodreads.com

Well, I think that’s enough, at least for Christmas. If you do buy and read any of them, or if you already have, please leave me your own review. I would love to hear your thoughts!

And if you enjoyed this story, check out some of my others, like 6 amateur mistakes I’ve made working with train-test splits or Web scraping in 5 minutes. All of them are available on my profile.

See you on Medium!