François Chollet’s Deep Learning with Python, published by Manning in November 2017, could more appropriately have been named Deep Learning with Python using Keras. Keras is a high-level API that provides a simple framework for setting up and running deep learning algorithms, using one of TensorFlow, Theano, or CNTK as its backend. As someone looking to get familiar with Keras and put it to practice, I found that this book does all that, and more. It provides a good round-up of all the deep learning techniques one hears about, and touches on some of the recent research work that has been incorporated into Keras. I bring up the title because someone looking specifically to implement deep learning in Python using, say, TensorFlow directly or one of the other frameworks, might not find what they need here.
The book is divided into two parts. Part 1 covers the fundamentals of deep learning: its history, tensors, neural networks, gradient-based optimization, model training and validation, and overfitting and underfitting. It also provides one example each of using Keras to build a simple neural network model for binary classification, multiclass classification, and regression. It is in Part 2 that the book really shines. This part covers convolutional neural networks (CNNs), recurrent neural networks (RNNs), multi-input and multi-output models, and generative models such as generative adversarial networks (GANs) and variational autoencoders. The book provides clear and concise explanations of the model architectures, and code examples to implement the models on public datasets. Keras makes the process of model setup and execution super simple. As an example, a dropout-regularized stacked GRU model takes under 10 lines of code, excluding data processing and evaluation plots.
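To give a sense of that brevity, here is a rough sketch of such a model in Keras. The layer sizes, dropout rates, and input shape below are illustrative choices of mine, not necessarily those used in the book:

```python
from tensorflow import keras
from tensorflow.keras import layers

# A dropout-regularized stacked GRU: two recurrent layers, each with
# input dropout and recurrent (timestep-to-timestep) dropout.
# Shapes and sizes here are illustrative, not the book's exact example.
model = keras.Sequential([
    keras.Input(shape=(None, 14)),          # variable-length sequences of 14 features
    layers.GRU(32, dropout=0.1, recurrent_dropout=0.5,
               return_sequences=True),      # pass full sequence to the next GRU
    layers.GRU(64, dropout=0.1, recurrent_dropout=0.5),
    layers.Dense(1),                        # scalar regression output
])
model.compile(optimizer="rmsprop", loss="mae")
```

The whole architecture really is about ten lines; everything else in a typical project is data preparation and evaluation.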
One of the features of the book that I particularly enjoyed is how Keras incorporates some of the recent research in the field. For example, the proper way to use dropout for RNNs, based on research by Yarin Gal, is to apply the same dropout mask at each timestep; this is built into Keras. It is also similarly straightforward to use a pre-trained convolutional network (such as the VGG16 architecture), freeze its convolutional base, and train a new classifier layer on top of it using one’s own dataset. An implementation in Keras of the DeepDream algorithm for modifying images based on representations learned by convnets is provided. The book demonstrates how Keras makes the process of building and training a model beguilingly simple, while at the same time being flexible enough to incorporate a few of the latest innovations in the field.
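The frozen-base workflow can be sketched in a few lines. This is my own minimal version of the pattern the book describes; I pass `weights=None` to avoid downloading the ImageNet weights here, whereas the book's workflow uses `weights="imagenet"`, and the head sizes are illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16

# Load VGG16 without its classifier head. The book uses the pre-trained
# ImageNet weights (weights="imagenet"); weights=None skips the download.
conv_base = VGG16(weights=None, include_top=False, input_shape=(150, 150, 3))
conv_base.trainable = False  # freeze the convolutional base

# New classifier head trained on one's own dataset,
# e.g. a binary task like cats vs. dogs.
model = keras.Sequential([
    conv_base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="rmsprop", loss="binary_crossentropy")
```

With the base frozen, only the two dense layers' weights are updated during training, which is what makes this cheap enough to run on a small dataset.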
I was able to run all of the code except that in Chapter 7 (where the code is meant to be treated more as a reference) and Chapter 8 (which I have not implemented yet); errors and typos were few and easy to fix. The appendix provides a step-by-step guide to setting up an AWS GPU instance to run the model examples. It can get expensive though (~$20/day in my case to keep the instance running), so I switched to running the code examples on my laptop, where the longer ones took nearly a day to run. I was also able to apply the approach described in the examples to a text classification problem and dataset that I had been working on, and it worked like a charm. Jupyter notebooks for the code samples in the book have been made available by the author here.
In a discussion on the future of deep learning at the end of the book, François Chollet, who is the primary author and maintainer of Keras, talks about automated machine learning as a future direction, where the model architecture is learned jointly with the model weights. Sensing the obvious question this raises, Chollet adds that the jobs of machine learning engineers won’t disappear, since they can now focus their efforts on high-value problems. Deep learning has largely made redundant the feature engineering and selection that form an important part of a data scientist’s role; automated machine learning will likely do the same for model selection and hyperparameter tuning. While there has been much chatter about how deep learning will replace several human jobs, it may also do so for some of its practitioners. At the very least, it is likely to redefine what it is that data scientists and ML engineers do.