A Recent History of Deep Learning

1 minute read

We take a look at a timeline of innovations in deep neural networks that have led to their popularity and widespread adoption.

1986

  • Paper on backpropagation by Rumelhart, Hinton, and Williams describes this key training algorithm, which is still a part of modern neural networks

1990

  • Paper by Yann LeCun et al. on using convolutional networks to classify handwritten digits provided by the US Postal Service

1997

  • Hochreiter and Schmidhuber publish the paper that introduces LSTM recurrent networks as a solution to the vanishing/exploding gradient problem. LSTMs are able to learn long-term dependencies in a recurrent neural network (RNN)

2009

  • The use of GPUs is shown to improve the computational speed of deep neural networks compared to CPUs

2011

  • Paper on Rectified Linear Units (ReLU) as an alternative to the sigmoid activation function for fast training of neural networks, by Glorot, Bordes, and Bengio at the University of Montreal

2012

  • AlexNet, a deep convolutional network by Krizhevsky, Sutskever, and Hinton, wins the ImageNet Large Scale Visual Recognition Challenge by a large margin, demonstrating the effectiveness of GPU-trained convolutional networks for image classification

2013

  • Paper on word2vec by Tomas Mikolov et al. at Google. Pre-trained word embeddings from word2vec can be used to improve the performance of recurrent neural networks.

2014

  • Dropout, a form of regularization that prevents neural networks from overfitting, is published by the Hinton lab

  • Paper on Gated Recurrent Units (GRUs) published by J. Chung et al. at the University of Montreal

  • Adam optimization, which combines momentum-based mini-batch gradient descent and RMSProp for generally faster convergence of gradient descent, is published by Kingma and Ba
