A Road Map for Deep Learning

Source: Deep Learning on Medium


By Jared

Deep learning is a form of machine learning that allows a computer to learn from experience and to understand things through a hierarchy of concepts, where each concept is defined in terms of simpler ones. This approach avoids the need for humans to specify all the knowledge that the computer needs. The hierarchy lets the computer learn complicated concepts by building them on top of simpler ones, which results in a deep structure with many layers.

Part 1: Applied Math

The first thing to learn on the way to deep learning is the applied math, which is its fundamental building block.

Linear Algebra

Linear algebra is a branch of mathematics that is widely used throughout engineering. However, because it is a form of continuous rather than discrete mathematics, many computer scientists don’t have much experience with it. A good understanding of linear algebra is essential for understanding and working with many machine learning algorithms, especially deep learning algorithms.

Topics to know from linear algebra:

Scalars, Vectors, Matrices, Tensors, Multiplying Matrices and Vectors, Identity and Inverse Matrices, Linear Dependence and Span, Norms, Special Matrices and Vectors, Eigendecomposition, Singular Value Decomposition, The Moore-Penrose Pseudo-inverse, The Trace Operator, and the Determinant.
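A few of these operations are easy to try by hand. Here is a minimal sketch in plain Python, with no libraries, so the helper names (matvec, matmul, inverse2 and so on) are made up for this illustration. In practice you would most likely use a numerical library instead. It multiplies a matrix by a vector, computes a norm, the trace, and the determinant, and checks that a matrix times its inverse gives the identity.

```python
import math

def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def l2_norm(x):
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(sum(v * v for v in x))

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inverse2(A):
    """Inverse of a 2x2 matrix; fails if the matrix is singular."""
    d = det2(A)
    if d == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]

A = [[2.0, 1.0],
     [1.0, 3.0]]
x = [1.0, 2.0]

y = matvec(A, x)                   # matrix-vector product
identity = matmul(A, inverse2(A))  # should be (numerically) the 2x2 identity

print("A x      =", y)
print("||x||_2  =", l2_norm(x))
print("trace(A) =", trace(A), " det(A) =", det2(A))
print("A A^-1   =", identity)
```

The 2x2 inverse formula only works for that size. It is here to make the definition concrete, not as a general-purpose routine.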

Probability and Information Theory

Probability theory is a mathematical framework for representing and quantifying uncertainty. In AI applications we use probability in two main ways:

The first is that it tells us how our AI systems should reason.

The second is that we can use probability and statistics to analyze the behavior of proposed AI systems.

Topics to know from probability:

Random Variables, Probability Distributions, Marginal Probability, Conditional Probability, Chain Rule of Conditional Probabilities, Independence and Conditional Independence, Expectation, Variance and Covariance, Common Probability Distributions, Useful Properties of Common Functions, Bayes’ Rule, Continuous Variables, Information Theory, and Probabilistic Models.
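To make a couple of these ideas concrete, here is a small sketch of Bayes' rule and Shannon entropy in plain Python. The function names and the numbers in the example (a condition with 1% prevalence, and a test with a 95% hit rate and a 5% false-alarm rate) are made up for illustration.

```python
import math

def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) via Bayes' rule for a binary hypothesis H.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Marginal probability of the evidence, summed over both cases of H.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

# Hypothetical numbers, chosen only for this example.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"P(condition | positive test) = {posterior:.3f}")
print(f"Entropy of a fair coin = {entropy_bits([0.5, 0.5])} bits")
```

With these numbers the posterior comes out at roughly 16%, far below the 95% hit rate, because the condition is rare to begin with. Quantifying that kind of reasoning is what the first use of probability above is about.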

Numerical Computation for Machine Learning

Machine learning algorithms almost always require a large amount of numerical computation. This typically refers to the iterative processes that ML algorithms use to solve mathematical problems by repeatedly refining an estimate of the solution. Common operations include optimization (finding a value that minimizes or maximizes some function) and solving systems of linear equations.

Topics to learn for Numerical Computation in Machine Learning:

Overflow and Underflow, Conditioning, Gradient-Based Optimization, and Constrained Optimization
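Two of these topics fit in a few lines. The sketch below, in plain Python with illustrative function names, shows a softmax that avoids overflow and underflow by subtracting the largest score first, and a bare-bones gradient descent loop that minimizes f(x) = (x - 3)^2. The learning rate and step count are arbitrary choices for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax."""
    # Subtracting the maximum does not change the result mathematically,
    # but it keeps every exponent <= 0, so math.exp cannot overflow, and
    # at least one term equals exp(0) = 1, so the sum cannot underflow to 0.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# A naive math.exp(1000.0) would raise OverflowError; the stable version is fine.
probs = softmax([1000.0, 1000.0, 999.0])
print("softmax:", probs)

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print("approximate minimizer:", x_min)
```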

Machine Learning Basics

Deep learning is really just a special kind of machine learning. To understand deep learning, you must have a solid understanding of machine learning. You will need to know what a learning algorithm is (linear regression is a simple example), how to fit such an algorithm to data, how to find patterns in that data, and how to tune hyperparameters. In the end, machine learning is essentially a form of applied statistics that relies on computers to estimate complicated functions.

Topics to learn for machine learning:

Learning Algorithms, Capacity, Overfitting and Underfitting, Hyperparameters, Validation Sets, Estimators, Bias and Variance, Maximum Likelihood, Bayesian Statistics, Supervised Learning Algorithms, Unsupervised Learning Algorithms, Stochastic Gradient Descent, and Building a Machine Learning Algorithm
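As a concrete example of a learning algorithm, here is a minimal sketch of single-feature linear regression fitted by ordinary least squares, with a couple of held-out points standing in for a validation set. The data is a made-up toy set and the function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between predictions and targets."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data (invented for this example): roughly y = 2x + 1 with small noise.
train_x, train_y = [0.0, 1.0, 2.0, 3.0, 4.0], [1.1, 2.9, 5.2, 6.8, 9.0]
val_x, val_y = [5.0, 6.0], [11.1, 12.9]

slope, intercept = fit_line(train_x, train_y)
train_mse = mean_squared_error(train_x, train_y, slope, intercept)
val_mse = mean_squared_error(val_x, val_y, slope, intercept)

print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"training MSE={train_mse:.4f} validation MSE={val_mse:.4f}")
```

Comparing the training error with the error on data the model has not seen is the basic check behind the overfitting, underfitting, and validation-set topics in the list.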

Part 2: Deep Learning Modern Practices

Deep learning provides a powerful framework for supervised learning. By creating a neural network and adding more layers and more units within each layer, you can represent functions of increasing complexity.

Deep Feedforward Networks

Deep feedforward networks, also called feedforward neural networks or multilayer perceptrons, are the quintessential deep learning models. The goal of a feedforward network is to approximate some function f.

Topics to learn from Deep Feedforward Networks:

Gradient-Based Learning, Hidden Units, Architecture Design, and Back-Propagation and Other Differentiation Algorithms
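To see how a feedforward network approximates a function, here is a tiny sketch: a network with one hidden layer of two ReLU units that computes XOR, which no purely linear model can represent. The weights are set by hand for illustration rather than learned, and the function names are made up.

```python
def relu(values):
    """Rectified linear unit, applied element-wise."""
    return [max(0.0, v) for v in values]

def feedforward_xor(x):
    """Two-layer network: y = w . relu(W^T x + c) + b, with hand-set weights."""
    W = [[1.0, 1.0],
         [1.0, 1.0]]    # W[i][j] connects input i to hidden unit j
    c = [0.0, -1.0]     # hidden-layer biases
    w = [1.0, -2.0]     # hidden-to-output weights
    b = 0.0             # output bias

    # Hidden layer: affine transformation followed by the nonlinearity.
    pre_activation = [sum(x[i] * W[i][j] for i in range(2)) + c[j] for j in range(2)]
    hidden = relu(pre_activation)

    # Output layer: a plain linear combination of the hidden units.
    return sum(w[j] * hidden[j] for j in range(2)) + b

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "->", feedforward_xor(x))
```

The nonlinearity in the hidden layer is what makes this work. Without the ReLU, the two layers would collapse into a single linear map.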

Regularization for Deep Learning

A common problem in machine learning is how to create an algorithm that will perform well not just on the training data, but also on new inputs. Many strategies in ML are designed to reduce test error, often at the expense of increased training error. These strategies are known collectively as regularization. One of the many goals of deep learning is to develop more effective regularization strategies.

Topics to learn from Regularization for deep learning:

Parameter Norm Penalties, Norm Penalties as Constrained Optimization, Regularization and Under-Constrained Problems, Dataset Augmentation, Noise Robustness, Semi-Supervised Learning, Multitask Learning, Early Stopping, Parameter Tying and Parameter Sharing, Sparse Representations, Bagging and Other Ensemble Methods, Dropout, Adversarial Training, Tangent Distance, Tangent Prop, and Manifold Tangent Classifier.
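The simplest item on this list, a parameter norm penalty, can be sketched in a few lines. The example below fits a single weight w by minimizing sum((w*x - y)^2) + penalty * w^2, which has the closed-form solution w = sum(x*y) / (sum(x^2) + penalty). The data and function names are invented for illustration.

```python
def ridge_slope(xs, ys, penalty):
    """Minimize sum((w*x - y)^2) + penalty * w^2 over a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)

def training_error(xs, ys, w):
    """Sum of squared errors on the training data, without the penalty term."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys))

# Toy data lying exactly on y = 2x.
xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]

for penalty in (0.0, 1.0, 14.0):
    w = ridge_slope(xs, ys, penalty)
    print(f"penalty={penalty:5.1f}  w={w:.3f}  training error={training_error(xs, ys, w):.3f}")
```

As the penalty grows, the weight shrinks toward zero and the training error goes up. That is the trade described above: the penalty is only worth paying if it lowers the error on new inputs.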

Optimization for Training Deep Models

Deep learning models involve optimization in many ways. For example, inference in models such as principal component analysis (PCA) involves solving an optimization problem. Of all the optimization problems involved in deep learning, the most difficult is neural network training.

Topics to learn from Optimization for Training Deep Models:

Learning vs Pure Optimization, Challenges in Neural Network Optimization, Basic Algorithms, Parameter Initialization Strategies, Algorithms with Adaptive Learning Rates, Approximate Second-Order Methods, and Optimization Strategies and Meta-Algorithms
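The most basic of the "Basic Algorithms" is minibatch stochastic gradient descent. Here is a minimal sketch that fits a single slope w to noise-free toy data, estimating the gradient of the squared error from a small random batch at each step. The learning rate, batch size, step count, and seed are arbitrary choices for the example, and the function name is made up.

```python
import random

def sgd_fit_slope(xs, ys, lr=0.1, batch_size=4, steps=500, seed=0):
    """Fit y = w * x by minibatch SGD on the squared error."""
    rng = random.Random(seed)   # seeded so the run is reproducible
    indices = list(range(len(xs)))
    w = 0.0
    for _ in range(steps):
        # Estimate the gradient from a small random subset of the data.
        batch = rng.sample(indices, batch_size)
        grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / batch_size
        w -= lr * grad
    return w

# Toy data on the line y = 3x (no noise), invented for this example.
xs = [i / 10.0 for i in range(-10, 11)]
ys = [3.0 * x for x in xs]

w = sgd_fit_slope(xs, ys)
print("learned slope:", w)
```

Each step only looks at four of the 21 points, yet the estimate still settles on the true slope. The adaptive learning rate and second-order methods in the list are refinements of this same loop.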

Convolutional Neural Networks (CNNs)

Convolutional neural networks are a specialized kind of neural network for processing data that has a known grid-like topology. Examples include time-series data, which can be thought of as a 1-D grid of samples taken at regular time intervals, and image data, which can be thought of as a 2-D grid of pixels. Convolution is a specialized kind of linear operation, and a convolutional network uses it in place of general matrix multiplication in at least one of its layers.

Topics to learn for CNNs:

The Convolution Operation, Motivation, Pooling, Convolution and Pooling as an Infinitely Strong Prior, Variants of the Basic Convolution Function, Structured Outputs, Data Types, Efficient Convolution Algorithms, and Random or Unsupervised Features
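For the 1-D case, the convolution and pooling operations can be sketched directly. The code below slides a small kernel over a signal with no padding and then applies max pooling. Like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation. The function names, signal, and kernel are illustrative.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal without padding (no kernel flip)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool1d(values, width=2, stride=2):
    """Keep the largest value in each window; leftover values are dropped."""
    return [max(values[i:i + width])
            for i in range(0, len(values) - width + 1, stride)]

signal = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]
edge_kernel = [-1.0, 1.0]   # responds to the change between neighbouring samples

features = conv1d_valid(signal, edge_kernel)
pooled = max_pool1d(features)

print("features:", features)   # [1.0, 2.0, 3.0, 4.0, 5.0]
print("pooled:  ", pooled)     # [2.0, 4.0]
```

The same two weights are applied at every position of the signal. That sharing of parameters is what separates a convolutional layer from a fully connected one.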

Recurrent Neural Networks (RNNs)

Recurrent neural networks are a family of neural networks for processing sequential data. Much as a CNN is specialized for processing a grid of values, an RNN is specialized for processing a sequence of values, and it shares the same parameters across time steps so that it can generalize across positions in the sequence.

Topics to learn for RNNs:

Unfolding Computational Graphs, Bidirectional RNNs, Encoder-Decoder Sequence-to-Sequence Architectures, Deep Recurrent Networks, Recursive Neural Networks, The Challenge of Long-Term Dependencies, Echo State Networks, Leaky Units and Other Strategies for Multiple Time Scales, The Long Short-Term Memory (LSTM) and Other Gated RNNs, Optimization for Long-Term Dependencies, and Explicit Memory
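Unfolding a recurrent network just means applying the same update at every time step. Here is a minimal sketch with a single hidden unit and scalar inputs, using the update h_t = tanh(w_h * h_{t-1} + w_x * x_t + b). The parameter values are arbitrary and the function name is made up for the example.

```python
import math

def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Unfold a one-unit recurrent network over a sequence of scalars.

    The same parameters (w_x, w_h, b) are reused at every time step:
        h_t = tanh(w_h * h_{t-1} + w_x * x_t + b)
    """
    h = h0
    states = []
    for x in inputs:
        h = math.tanh(w_h * h + w_x * x + b)
        states.append(h)
    return states

# A single impulse followed by silence: the hidden state carries a fading
# memory of the first input forward through time.
states = rnn_forward([1.0, 0.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
for t, h in enumerate(states, start=1):
    print(f"h_{t} = {h:.4f}")
```

Because the loop works for any sequence length, the same three parameters handle short and long inputs alike. The way the memory of the first input fades here also hints at why long-term dependencies are hard to learn.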

Deep Learning Methodology

Successfully applying deep learning techniques requires more than just a good knowledge of what algorithms exist and the principles that explain how they work. During day-to-day development of machine learning systems, practitioners need to decide whether to gather more data, increase or decrease model complexity, add or remove features, improve the optimization of a model, improve approximate inference in a model, or debug the implementation of the model. All of these are very time consuming to try, so it is important to be able to determine the right course of action.

Topics for deep learning methodology:

Performance Metrics, Default Baseline Models, Determining Whether to Gather More Data, Selecting Hyperparameters, and Debugging Strategies
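Deciding on a course of action starts with measuring the right thing. Here is a small sketch of common performance metrics for a binary classifier, in plain Python. The labels and predictions are made up, and the function name is illustrative.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented labels and predictions for eight examples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Accuracy alone can be misleading when one class is rare, which is why precision and recall are usually reported alongside it.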

Deep Learning Applications

Deep learning can be used to solve problems in computer vision, speech recognition, natural language processing, and other areas. Each of these tasks requires some degree of specialization when it comes to designing the algorithms.

Topics to learn for deep learning applications:

Large-Scale Deep Learning, Computer Vision, Speech Recognition, and Natural Language Processing

Part 3: Deep Learning Research Topics

This section simply lists the more ambitious and more advanced research topics in deep learning.

Linear Factor Models

Probabilistic PCA and Factor Analysis, Independent Component Analysis (ICA), Slow Feature Analysis, Sparse Coding, and Manifold Interpretation of PCA

Autoencoders

Undercomplete Autoencoders, Regularized Autoencoders, Representational Power, Layer Size and Depth, Stochastic Encoders and Decoders, Denoising Autoencoders, Learning Manifolds with Autoencoders, Contractive Autoencoders, Predictive Sparse Decomposition, and Applications of Autoencoders

Representation Learning

Greedy Layer-Wise Unsupervised Pretraining, Transfer Learning and Domain Adaptation, Semi-Supervised Disentangling of Causal Factors, Distributed Representation, Exponential Gains from Depth, and Clues to Discover Underlying Causes

Structured Probabilistic Models For Deep Learning

Challenges of Unstructured Modeling, Using Graphs to Describe Model Structure, Sampling from Graphical Models, Advantages of Structured Modeling, Learning about Dependencies, Inference and Approximate Inference, and The Deep Learning Approach to Structured Probabilistic Models

Monte Carlo Methods

Sampling and Monte Carlo Methods, Importance Sampling, Markov Chain Monte Carlo Methods, Gibbs Sampling, and The Challenge of Mixing between Separated Modes

The Partition Function

The Log-Likelihood Gradient, Stochastic Maximum Likelihood and Contrastive Divergence, Pseudolikelihood, Score Matching and Ratio Matching, Denoising Score Matching, Noise-Contrastive Estimation, and Estimating the Partition Function

Approximate Inference

Inference as Optimization, Expectation Maximization, MAP Inference and Sparse Coding, Variational Inference and Learning, and Learned Approximate Inference

Deep Generative Models

Boltzmann Machines, Restricted Boltzmann Machines, Deep Belief Networks, Deep Boltzmann Machines, Boltzmann Machines for Real-Valued Data, Convolutional Boltzmann Machines, Boltzmann Machines for Structured or Sequential Outputs, Other Boltzmann Machines, Back-Propagation through Random Operations, Directed Generative Nets, Drawing Samples from Autoencoders, Generative Stochastic Networks, Other Generation Schemes, and Evaluating Generative Models

Working through all of these topics will give you a solid understanding of deep learning.

Here are some resources that go over most if not all of the aforementioned topics:

Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

Neural Networks and Deep Learning