A Selective Overview of Deep Learning

Abstract

Deep learning has arguably achieved tremendous success in recent years. In simple words, deep learning uses the composition of many nonlinear functions to model the complex dependency between input features and labels. While neural networks have a long history, recent advances have greatly improved their performance in computer vision, natural language processing, etc. From the statistical and scientific perspective, it is natural to ask: What is deep learning? What are the new characteristics of deep learning, compared with classical methods? What are the theoretical foundations of deep learning? To answer these questions, we introduce common neural network models (e.g., convolutional neural nets, recurrent neural nets, generative adversarial nets) and training techniques (e.g., stochastic gradient descent, dropout, batch normalization) from a statistical point of view. Along the way, we highlight new characteristics of deep learning (including depth and over-parametrization) and explain their practical and theoretical benefits. We also sample recent results on theories of deep learning, many of which are only suggestive. While a complete understanding of deep learning remains elusive, we hope that our perspectives and discussions serve as a stimulus for new statistical research.
