We take a look at a timeline of innovations in deep neural networks that have led to their popularity and widespread adoption.
- Paper on backpropagation (1986) by Rumelhart, Hinton, and Williams describes this key training algorithm, which remains a core component of modern neural networks
- Paper by Yann LeCun et al. (1989) on using convolutional networks for classifying handwritten digits provided by the US Postal Service
- Hochreiter and Schmidhuber publish the paper (1997) that introduces LSTM recurrent networks as a solution to the vanishing/exploding gradient problem. LSTMs are able to learn long-term dependencies in a recurrent neural network (RNN)
- The use of GPUs is shown to substantially improve the training speed of deep neural networks compared to CPUs
- Paper on Rectified Linear Units (ReLU) as an alternative to the sigmoid activation function for faster training of deep neural networks, by Glorot, Bordes, and Bengio (2011) at the University of Montreal (a short gradient comparison is sketched after this list)
- Paper on applying convolutional networks to the ImageNet database published by Krizhevsky, Sutskever, and Hinton at the University of Toronto. The architecture, called AlexNet, won the ImageNet Large Scale Visual Recognition Challenge in 2012 by a significant margin over the next-best submission and prior winners, and is widely considered the event that triggered the AI explosion
- Paper on word2vec by Tomas Mikolov et al. (2013) at Google. Pre-trained word embeddings from word2vec can be used to improve the performance of recurrent neural networks
- Paper on dropout (2014) as a form of regularization that prevents neural networks from overfitting, published by the Hinton lab (a minimal sketch follows this list)
- Paper on Gated Recurrent Units (GRUs) published by J. Chung et al. (2014) at the University of Montreal
- Paper on Adam optimization (2014) by Kingma and Ba, which combines momentum-based mini-batch gradient descent with RMSProp's per-parameter adaptive learning rates, generally resulting in faster convergence of gradient descent (the update rule is sketched after this list)
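
To make the ReLU milestone concrete, here is a minimal NumPy sketch comparing the gradients of the two activations; the function names and sample inputs are our own illustration, not from the Glorot et al. paper. The point it demonstrates is that sigmoid gradients shrink toward zero for large-magnitude inputs, while ReLU's gradient stays at 1 for any positive input, which speeds up gradient-based training.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

# Gradient of sigmoid is s(x) * (1 - s(x)); gradient of ReLU is 1 for x > 0, else 0.
x = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
sigmoid_grad = sigmoid(x) * (1.0 - sigmoid(x))
relu_grad = (x > 0).astype(float)

print(sigmoid_grad)  # shrinks toward 0 at the extremes -> slow learning
print(relu_grad)     # exactly 0 or 1 -> no vanishing gradient for positive inputs
```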
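
For dropout, the sketch below uses the now-common "inverted dropout" formulation, which rescales surviving activations at training time so that nothing changes at inference; the original paper instead scales weights at test time. The drop probability and inputs are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop=0.5, train=True):
    """Inverted dropout: zero out each unit with probability p_drop during
    training and rescale the survivors by 1 / (1 - p_drop)."""
    if not train:
        return activations
    mask = (rng.random(activations.shape) >= p_drop).astype(activations.dtype)
    return activations * mask / (1.0 - p_drop)

h = np.ones((2, 4))
print(dropout(h, p_drop=0.5))   # roughly half the units zeroed, survivors scaled to 2.0
print(dropout(h, train=False))  # unchanged at inference time
```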
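
Finally, a minimal sketch of the Adam update rule: a momentum-style first-moment estimate combined with an RMSProp-style second-moment estimate, each bias-corrected. The default hyperparameters shown match those suggested by Kingma and Ba; the toy quadratic objective is our own illustration.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on parameters theta given gradient grad at step t >= 1."""
    m = beta1 * m + (1 - beta1) * grad       # momentum: moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2  # RMSProp: moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction for the early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 1001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.1)
print(theta)  # approaches the minimizer at 0
```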