1. Introduction
Artificial intelligence (AI) has been a story of booms and busts, yet by any traditional measure of success, the last few years have been marked by exceptional progress. Much of this progress has come from recent advances in “deep learning,” characterized by learning large neural network-style models with multiple layers of representation (see Glossary in Table 1). These models have achieved remarkable gains in many domains spanning object recognition, speech recognition, and control (LeCun et al. 2015; Schmidhuber 2015). In object recognition, Krizhevsky et al. (2012) trained a deep convolutional neural network (ConvNet [LeCun et al. 1989]) that nearly halved the previous state-of-the-art error rate on the most challenging benchmark to date. In the years since, ConvNets have continued to dominate, recently approaching human-level performance on some object recognition benchmarks (He et al. 2016; Russakovsky et al. 2015; Szegedy et al. 2014). In automatic speech recognition, hidden Markov models (HMMs) have been the leading approach since the late 1980s (Juang & Rabiner 1990), yet this framework has been chipped away piece by piece and replaced with deep learning components (Hinton et al. 2012). Now, the leading approaches to speech recognition are fully neural network systems (Graves et al. 2013; Hannun et al. 2014). Ideas from deep learning have also been applied to learning complex control problems. Mnih et al. (2015) combined ideas from deep learning and reinforcement learning to make a “deep reinforcement learning” algorithm that learns to play large classes of simple video games from just frames of pixels and the game score, achieving human- or superhuman-level performance on many of them (see also Guo et al. 2014; Schaul et al. 2016; Stadie et al. 2016).
Table 1. Glossary
Neural network: A network of simple neuron-like processing units that collectively performs complex computations. Neural networks are often organized into layers, including an input layer that presents the data (e.g., an image), hidden layers that transform the data into intermediate representations, and an output layer that produces a response (e.g., a label or an action). Recurrent connections are also popular when processing sequential data.
Deep learning: A neural network with at least one hidden layer (some networks have dozens). Most state-of-the-art deep networks are trained using the backpropagation algorithm to gradually adjust...