A Recent History of Deep Learning


We take a look at a timeline of innovations in deep neural networks that have led to their popularity and widespread adoption.


  • Paper on backpropagation by Rumelhart, Hinton, and Williams describes this key training algorithm, which remains a core component of modern neural networks
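To make the idea concrete, here is a minimal sketch of backpropagation on a toy two-weight network, checked against a finite-difference gradient. The network shape, names, and numbers are illustrative assumptions, not taken from the original paper.

```python
import math

# Toy network: x -> (w1) -> sigmoid -> (w2) -> output, squared-error loss.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    h = sigmoid(w1 * x)   # hidden activation
    y = w2 * h            # linear output
    return h, y

def loss(y, t):
    return 0.5 * (y - t) ** 2

def backprop(x, t, w1, w2):
    """Apply the chain rule backwards through the network."""
    h, y = forward(x, w1, w2)
    dy = y - t                   # dL/dy
    dw2 = dy * h                 # gradient for the output weight
    dh = dy * w2                 # gradient flowing back into the hidden unit
    dw1 = dh * h * (1 - h) * x   # sigmoid'(z) = h * (1 - h)
    return dw1, dw2

# Sanity check against a numerical (central finite-difference) gradient.
x, t, w1, w2 = 0.5, 1.0, 0.8, -0.3
dw1, dw2 = backprop(x, t, w1, w2)
eps = 1e-6
num_dw1 = (loss(forward(x, w1 + eps, w2)[1], t)
           - loss(forward(x, w1 - eps, w2)[1], t)) / (2 * eps)
assert abs(dw1 - num_dw1) < 1e-6
```

The finite-difference check is the standard way to convince yourself that the analytic gradients are wired up correctly.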


  • Paper by Yann LeCun et al. on using convolutional networks for classifying handwritten digits provided by the US Postal Service


  • Hochreiter and Schmidhuber publish the paper that introduces LSTM recurrent networks as a solution to the vanishing/exploding gradient problem. LSTMs are able to learn long-term dependencies in a recurrent neural network (RNN)


  • The use of GPUs is shown to substantially improve the computational speed of training deep neural networks compared to CPUs


  • Paper on Rectified Linear Units (ReLU) as an alternative to the sigmoid activation function for faster training of neural networks, by Glorot, Bordes, and Bengio at the University of Montreal
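A quick sketch of why ReLU helps training speed: the sigmoid's gradient saturates toward zero for large pre-activations, while ReLU's gradient stays at 1 for any positive input, so gradients propagate without shrinking. The specific values below are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1 - s)   # peaks at 0.25 and vanishes for large |z|

def relu(z):
    return max(0.0, z)

def relu_grad(z):
    return 1.0 if z > 0 else 0.0   # constant gradient on the active side

# Compare gradients as the pre-activation grows.
for z in (0.5, 5.0, 10.0):
    print(f"z={z}: sigmoid'={sigmoid_grad(z):.6f}, relu'={relu_grad(z)}")
```

At `z = 10` the sigmoid gradient is already below 1e-4, which is the saturation effect that slows down sigmoid-based training in deep stacks.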



  • Paper on word2vec by Tomas Mikolov et al. at Google. Pre-trained word embeddings using word2vec can be used to improve performance of recurrent neural networks.


  • Dropout, a form of regularization that prevents neural networks from overfitting, published by the Hinton lab
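The dropout idea can be sketched in a few lines. This uses the common "inverted dropout" variant (scaling the surviving activations at training time so no rescaling is needed at test time); the function name and values are illustrative assumptions.

```python
import random

def dropout(activations, p_drop, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p_drop,
    scale the survivors by 1/(1 - p_drop) so expected values match."""
    if not training:
        return list(activations)  # identity at test time
    rng = rng or random.Random()
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

# Roughly half the units are zeroed; the rest are doubled (p_drop = 0.5).
h = [0.2, 1.5, 0.7, 0.9]
print(dropout(h, p_drop=0.5, rng=random.Random(0)))
```

Because each forward pass samples a different mask, the network cannot rely on any single unit, which is the regularizing effect.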

  • Paper on Gated Recurrent Units (GRUs) published by J. Chung et al. at the University of Montreal

  • Adam optimization, which combines momentum-based mini-batch gradient descent and RMSProp for generally faster convergence of gradient descent, published by Kingma and Ba
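The combination described above can be sketched for a single scalar parameter: the first moment is the momentum-style ingredient, the second moment is the RMSProp-style ingredient, and both are bias-corrected as in Kingma and Ba's paper. The toy objective below is an illustrative assumption.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter."""
    m = beta1 * m + (1 - beta1) * grad       # momentum-style first moment
    v = beta2 * v + (1 - beta2) * grad ** 2  # RMSProp-style second moment
    m_hat = m / (1 - beta1 ** t)             # bias correction (t starts at 1)
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta^2 (gradient 2 * theta) starting from theta = 1.0.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(theta)  # theta ends up near the minimum at 0
```

Note that the effective step size is roughly `lr` regardless of the gradient's scale, since the update divides the first moment by the square root of the second; this per-parameter scaling is what RMSProp contributes.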

