The last decade has seen artificial intelligence deliver impressive results in areas ranging from image detection and automated driving to product recommendations on e-commerce platforms.

Artificial intelligence is a broad category that includes machine learning, and deep learning is in turn a subset of machine learning.

Deep learning is based on neural network algorithms that were inspired by the functioning of neurons in the human brain, although biological neurons are much more complex. A typical artificial neural network consists of an input layer, an output layer and one or more hidden layers in between. Each layer consists of units, often called neurons, that transform the layer's input data into output data that is used by the neurons in the next layer. The “deep” in deep learning refers to the typically large number of hidden layers in deep neural networks.
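The layer-by-layer transformation described above can be illustrated with a minimal sketch in plain Python. The network shape, the weights and the choice of a sigmoid activation are assumptions made for illustration only; a real network would learn its weights from data.

```python
import math


def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One layer: each neuron computes a weighted sum of the inputs,
    adds its bias and applies the sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
# Each entry is (weights, biases); the numbers are arbitrary.
LAYERS = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]

output = network_forward([1.0, 2.0], LAYERS)
print(output)
```

Adding more (weights, biases) pairs to LAYERS adds more hidden layers, which is what makes a network "deeper".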

Although online courses are an important way of gaining knowledge about deep learning, books remain one of the key ways to learn about the field. They help beginners gain a foothold in this area of artificial intelligence, while seasoned practitioners can use them to deepen their existing knowledge.

We present a list of excellent deep learning books for 2020, followed by a more in-depth discussion of each book:

  • Neural Networks and Deep Learning, by Michael Nielsen,
  • Deep Learning with Python, by Francois Chollet,
  • Deep Learning (Adaptive Computation and Machine Learning series), by Ian Goodfellow, Yoshua Bengio, Aaron Courville,
  • Deep Learning: A Practitioner’s Approach, by Josh Patterson, Adam Gibson,
  • Machine Learning Yearning, by Andrew Ng,
  • Hands-On Machine Learning with Scikit-Learn and TensorFlow, by Aurélien Géron,
  • Interpretable Machine Learning, by Christoph Molnar,
  • The Hundred-Page Machine Learning Book, by Andriy Burkov,
  • Python Machine Learning (3rd Edition), by Sebastian Raschka.

 

Neural Networks and Deep Learning

About the book:

Neural Networks and Deep Learning differentiates itself from other deep learning books by helping readers understand the core principles of neural networks and deep learning. It does so through several detailed code examples, such as the recognition of handwritten digits. The book covers topics that played an important role in the historical evolution of deep learning, such as the vanishing gradient problem and the workings of the backpropagation algorithm, all from first principles. It is recommended for anyone who wants to learn more about neural networks and the intuition behind them, presented step by step with specific, illustrative examples.

Author:

Michael Nielsen.

Twitter: https://twitter.com/michael_nielsen

Author’s website: http://neuralnetworksanddeeplearning.com/about.html

Where to buy the book:

The book is free to read online at: http://neuralnetworksanddeeplearning.com/index.html

Additional material:

Code samples from the book are available at https://github.com/mnielsen/neural-networks-and-deep-learning.

Difficulty:

Medium

 

Deep Learning with Python

About the book:

The author presents deep learning through many useful and interesting applications of deep learning methods. The book starts with a brief history of deep learning, a welcome topic for beginners in the field. The next chapter focuses on the mathematical building blocks of neural networks: scalars, vectors, matrices, and 3D and higher-dimensional tensors. It introduces the reader to the operations that can be performed on tensors, from element-wise operations, broadcasting and reshaping to geometric interpretations of tensor operations. As the book predominantly uses the Keras library for its code examples, it is especially useful for readers who want to learn more about Keras through code case studies. The author has included detailed chapters on computer vision tasks, as well as on deep learning models applied to text and sequences. The chapter on generative deep learning includes examples of text generation with LSTM, an implementation of DeepDream in Keras, neural style transfer, image generation with variational autoencoders and an introduction to generative adversarial networks.
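To give a flavour of the tensor operations mentioned above, here is a short sketch using NumPy. It is not taken from the book; the array values and variable names are made up for illustration, and NumPy is assumed to be installed.

```python
import numpy as np

# Tensors of increasing rank (number of axes).
scalar = np.array(5.0)                      # 0D tensor
vector = np.array([1.0, 2.0, 3.0])          # 1D tensor, shape (3,)
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])        # 2D tensor, shape (2, 3)
tensor3d = np.zeros((4, 2, 3))              # 3D tensor, shape (4, 2, 3)

# Element-wise operations act on each entry independently.
doubled = matrix * 2.0
relu = np.maximum(matrix - 3.0, 0.0)        # negative entries become 0

# Broadcasting: the (3,) vector is virtually repeated along the rows
# so that it can be added to the (2, 3) matrix.
shifted = matrix + vector

# Reshaping rearranges the same six values into a new shape.
reshaped = matrix.reshape(3, 2)

print(scalar.ndim, vector.ndim, matrix.ndim, tensor3d.ndim)
print(shifted)
print(reshaped)
```

Broadcasting avoids writing an explicit loop over rows, which is why it appears so often in deep learning code.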

Author:

Francois Chollet is a software engineer at Google and the creator of Keras, a widely used open-source deep learning library.

Twitter: https://twitter.com/fchollet

Website: https://fchollet.com/

Where to buy the book:

The book is available at Amazon.

Additional material:

Companion Jupyter notebooks with code samples from the book are available in the GitHub repository:

https://github.com/fchollet/deep-learning-with-python-notebooks

Difficulty:

High

Deep Learning (Adaptive Computation and Machine Learning series)

About the book:

The Deep Learning book is divided into three major parts. Part 1 introduces the main underlying mathematical concepts needed for the remaining parts of the book: linear algebra, probability and information theory, numerical computation and machine learning basics.

The second part is the main part of the book. It introduces deep feedforward networks and shows how regularization addresses overfitting in deep learning, followed by chapters on optimization for training deep learning models, convolutional networks (especially interesting for computer vision tasks), recurrent and recursive nets for problems with data in the form of sequences (e.g. neural machine translation), and practical methodology and applications.

In its third part, the book presents several advanced topics: linear factor models, autoencoders, representation learning, Monte Carlo methods and deep generative models.

Authors:

Ian Goodfellow, Yoshua Bengio, Aaron Courville.

Where to buy the book:

The book is available at Amazon.

Additional material:

The lecture slides are available at http://www.deeplearningbook.org/lecture_slides.html.

Difficulty:

High

Deep Learning: A Practitioner’s Approach

About the book:

The first part of the book focuses on theory and fundamentals to give the reader a foundation for the second part. Chapter 1 reviews key machine learning and deep learning topics; beginners can use it to refresh their knowledge before delving into the more complex topics in later chapters. Chapters 2 to 4 introduce the foundations of neural networks, including four major architectures of deep neural networks. Chapters 6 and 7 are valuable for those interested in tuning deep neural networks, as they are agnostic with respect to the platform used. Otherwise, the book mainly uses DL4J for its code examples. Chapter 8 has a useful introduction to vectorization methods, followed by a chapter on the use of DL4J on Spark and Hadoop.

Authors:

Josh Patterson, Adam Gibson.

Where to buy the book:

The book is available at Amazon.

Additional material:

Supplemental material (data, scripts, command line tools) is available for download at: https://github.com/deeplearning4j/oreilly-book-dl4j-examples.

Difficulty:

High

Machine Learning Yearning

About the book:

The book does not focus on teaching specific machine learning algorithms or on providing code examples for particular ML applications; it contains no code and almost no mathematical formulas. Rather, it is intended for those who already have experience in developing machine learning models and helps them improve their knowledge in areas such as ML design, setting up development and test sets, error analysis and debugging. Unlike most other books, it has a large number of chapters, 58, most of them just one or two pages long. The author states that the chapters are deliberately short, so “you can print them out and get your teammates to read just the 1-2 pages you need them to know.”

Some of the more interesting sets of chapters include:

  • designing train, dev and test sets (chapters 5-7, 10-12)
  • optimization metrics (chapters 8-10)
  • error analysis (chapters 14-19, 26)
  • discussion of bias and variance (chapters 20-32)
  • analysing human-level performance (chapters 33-35)

 

Author:

Andrew Ng is a co-founder of Coursera, a former VP and Chief Scientist of Baidu, and an Adjunct Professor at Stanford University.

Where to buy the book:

The book is available at https://www.deeplearning.ai/machine-learning-yearning/.

Difficulty:

Medium

Hands-On Machine Learning with Scikit-Learn and TensorFlow

About the book:

Hands-On Machine Learning with Scikit-Learn and TensorFlow is one of the best books for both beginners and experienced machine learning practitioners. It covers the major machine learning algorithms, from linear regression, logistic regression and support vector machines to decision trees and random forests. All models are introduced in a comprehensive way, with advanced topics covered as well. The book applies ML models to several specific examples, together with complete code. Besides machine learning models, it also introduces the reader to key concepts in machine learning, from building train/test data sets to metrics for evaluating model performance, including precision, recall, F1 score and, for classification problems, the confusion matrix.
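The classification metrics mentioned above can be computed by hand from the four cells of a binary confusion matrix. The following is a minimal sketch in plain Python rather than code from the book; the function names and the toy labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1      # correctly predicted positive
        elif pred == positive:
            fp += 1      # predicted positive, actually negative
        elif truth == positive:
            fn += 1      # predicted negative, actually positive
        else:
            tn += 1      # correctly predicted negative
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and their harmonic mean (F1 score)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy labels, invented for this example.
Y_TRUE = [1, 0, 1, 1, 0, 0, 1, 0]
Y_PRED = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(Y_TRUE, Y_PRED))
print(precision_recall_f1(Y_TRUE, Y_PRED))
```

In practice a library such as scikit-learn provides these metrics ready-made; writing them out once makes it clearer what each one measures.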

The second part of the book starts by introducing TensorFlow, followed by a discussion of artificial neural networks, the training of deep neural nets, convolutional neural networks, recurrent neural networks, autoencoders and reinforcement learning.

Author:

Aurélien Géron.

Twitter: https://twitter.com/aureliengeron.

Where to buy the book:

The book is available at Amazon.

Additional material:

Example code and solutions to exercises in the book Hands-On Machine Learning with Scikit-learn and TensorFlow are available at https://github.com/ageron/handson-ml.

Difficulty:

High

 

Interpretable Machine Learning

About the book:

Machine learning and artificial intelligence are becoming an important part of our lives and of the decisions that affect us. With most AI algorithms behaving as black boxes, however, the interpretability of those decisions is no longer a given. This has led to increased demand for interpretable machine learning models, from both consumers and regulators. For example, the “right to explanation” has become one of the key parts of the GDPR. This book introduces interpretable machine learning, or explainable artificial intelligence (XAI), in great detail and is a welcome addition to the machine learning and deep learning field.

The first chapters introduce the concept and importance of ML interpretability, together with general information about the characteristics of human explanations, which should serve as a guide for ML models. This is followed by a discussion of ML models that are highly interpretable, including linear regression, logistic regression and decision trees.

Interpretation of more complex machine learning decisions relies on so-called model-agnostic methods, which do not depend on the specific design of the machine learning model. The author discusses important XAI methods, including the Partial Dependence Plot (PDP), Individual Conditional Expectation (ICE) plots, permutation feature importance, local surrogate models (LIME) and the use of Shapley values for interpreting ML model decisions (the SHAP method).
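As an illustration of what "model-agnostic" means, here is a minimal sketch of permutation feature importance in plain Python. It is not code from the book: the toy model, the data and the function names are hypothetical, and accuracy is used as the score for simplicity. The method only calls the model as a black box, so any classifier could be plugged in.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the right label."""
    return sum(model(row) == y for row, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, feature, repeats=20, seed=0):
    """Average drop in accuracy when one feature column is shuffled.

    Shuffling breaks the link between the feature and the label, so a
    large drop means the model relies heavily on that feature.
    """
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [row[feature] for row in rows]
        rng.shuffle(column)
        shuffled = [
            row[:feature] + [value] + row[feature + 1:]
            for row, value in zip(rows, column)
        ]
        drops.append(baseline - accuracy(model, shuffled, labels))
    return sum(drops) / repeats


def toy_model(row):
    """Hypothetical black-box classifier that only looks at feature 0."""
    return 1 if row[0] > 0.5 else 0


# Invented data: two features per row, labels produced by the toy model.
ROWS = [[0.1, 0.9], [0.9, 0.2], [0.3, 0.7], [0.8, 0.4], [0.2, 0.1], [0.7, 0.6]]
LABELS = [toy_model(row) for row in ROWS]

importance_0 = permutation_importance(toy_model, ROWS, LABELS, feature=0)
importance_1 = permutation_importance(toy_model, ROWS, LABELS, feature=1)
print(importance_0, importance_1)
```

Because the toy model ignores feature 1, shuffling that column never changes a prediction and its importance is zero, while shuffling feature 0 lowers the accuracy.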

Author:

Christoph Molnar.

Where to buy the book:

The book is available for free at https://christophm.github.io/interpretable-ml-book/r-packages-used-for-examples.html.

Additional material:

The code and text of the book are available at https://github.com/christophM/interpretable-ml-book.

Difficulty:

Medium

 

The Hundred-Page Machine Learning Book

About the book:

As the title suggests, the author's goal was to condense machine learning knowledge into 100 pages, far fewer than most comparable books. Even though the current version has grown to 160 pages, the book provides a lot of valuable content on a wide range of topics and is interesting for both beginners and seasoned machine learning practitioners.

Chapter 1 introduces the main types of machine learning, and Chapter 2 is a useful introduction to the important mathematical building blocks of machine learning. Chapter 3 discusses standard algorithms, such as linear regression, logistic regression, decision trees, support vector machines and k-nearest neighbors. Chapter 4 discusses the main concepts behind machine learning algorithms, followed by an introduction to feature engineering, one-hot encoding, binning, normalization, dealing with missing features, underfitting and overfitting, regularization, and model performance metrics and concepts such as precision, recall, F1 score and the confusion matrix. The author also introduces the reader to hyperparameter tuning and cross-validation.

Chapter 6 focuses on neural networks and deep learning, with background information on two key neural network architectures: convolutional neural networks and recurrent neural networks. Unsupervised learning is the topic of Chapter 9, which introduces k-means clustering, DBSCAN and HDBSCAN, as well as dimensionality reduction methods such as principal component analysis.

Author:

Andriy Burkov.

Where to buy the book:

The book is available on Amazon.

Additional material:

Code from the book is available at https://github.com/aburkov/theMLbook.

Difficulty:

Medium

Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2, 3rd Edition

About the book:

This book by Sebastian Raschka, now in its third edition, is suitable for both beginners and advanced readers, with many example data sets and references for advanced problems.

The book starts by training a simple machine learning algorithm for classification, the perceptron, nicely illustrating the use of one of the earliest machine learning models. Next, the author discusses several ML classifiers using scikit-learn, followed by a chapter on pre-processing and advice on how to build good training data sets. This is followed by introductions to dimensionality reduction, hyperparameter tuning, ensemble learning, applying machine learning to sentiment analysis, integrating a machine learning model into a web application, unsupervised learning, parallelizing neural network training using TensorFlow, classifying images with convolutional networks and modelling sequences with recurrent neural networks.
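For readers who have not met the perceptron before, the following is a minimal sketch of the classic perceptron learning rule in plain Python. It is not the book's implementation; the function names, the learning rate, the number of epochs and the toy data set (the logical AND function) are assumptions chosen for illustration.

```python
def predict(weights, bias, x):
    """Output 1 if the weighted sum plus bias is non-negative, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation >= 0 else 0


def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Perceptron rule: after every misclassified sample, nudge the
    weights and bias in the direction that reduces the error."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)   # -1, 0 or +1
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Toy, linearly separable data set: the logical AND of two inputs.
SAMPLES = [[0, 0], [0, 1], [1, 0], [1, 1]]
LABELS = [0, 0, 0, 1]

weights, bias = train_perceptron(SAMPLES, LABELS)
predictions = [predict(weights, bias, x) for x in SAMPLES]
print(weights, bias, predictions)
```

The perceptron can only separate classes with a straight line (or hyperplane), which is one reason later chapters of such books move on to more flexible classifiers.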

The book finishes with a discussion of generative adversarial networks and reinforcement learning.

Author:

Sebastian Raschka.

Twitter: https://twitter.com/rasbt.

Where to buy the book:

The book is available at Amazon.

Additional material:

The code examples from the book are available at https://github.com/rasbt/python-machine-learning-book-3rd-edition.

Difficulty:

High

Articles 

If you are interested in a product categorization API for e-commerce, we invite you to read a Medium article on product categorization APIs in e-commerce.

It contains a lot of useful information about product categorization: which taxonomies are applicable to e-commerce and which to website categorization, what the pre-processing steps are, which machine learning models can be used for this text classification problem, and more.