Introducing Deep Learning with MATLAB

Source: Deep Learning on Medium

What is Deep Learning?

Deep learning is a type of machine learning in which a model learns to perform classification tasks directly from images, text, or sound. Deep learning is usually implemented using a neural network architecture. The term “deep” refers to the number of layers in the network — the more layers, the deeper the network. Traditional neural networks contain only 2 or 3 layers, while deep networks can have hundreds.

Deep Learning Applications

Here are just a few examples of deep learning at work:

  • A self-driving vehicle slows down as it approaches a pedestrian crosswalk.
  • An ATM rejects a counterfeit bank note.
  • A smartphone app gives an instant translation of a foreign street sign.

Deep learning is especially well suited to identification applications such as face recognition, text translation, voice recognition, and advanced driver assistance systems, including lane classification and traffic sign recognition.

What Makes Deep Learning State-of-the-Art?

In a word, accuracy. Advanced tools and techniques have dramatically improved deep learning algorithms, to the point where they can outperform humans at classifying images, win against the world’s best Go player, or enable a voice-controlled assistant such as Amazon Echo or Google Home to find and download that new song you like.

Three technology enablers make this degree of accuracy possible:

Inside a Deep Neural Network

A deep neural network combines multiple nonlinear processing layers, using simple elements operating in parallel and inspired by biological nervous systems. It consists of an input layer, several hidden layers, and an output layer. The layers are interconnected via nodes, or neurons, with each hidden layer using the output of the previous layer as its input.
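The layer-by-layer flow described above can be sketched in a few lines of base MATLAB. This is only an illustration under stated assumptions: the layer sizes, the random weights, and the choice of ReLU and softmax are hypothetical stand-ins, not values taken from any particular network.

```matlab
% Minimal forward pass through a small fully connected network.
% Layer sizes are hypothetical: 4 inputs, hidden layers of 5 and 3 units, 2 outputs.
rng(0);                                   % fix the seed so the run is repeatable
layerSizes = [4 5 3 2];
numLayers  = numel(layerSizes) - 1;       % number of weight layers

% Random weights and zero biases stand in for the values training would produce.
W = cell(1, numLayers);
b = cell(1, numLayers);
for k = 1:numLayers
    W{k} = 0.1 * randn(layerSizes(k+1), layerSizes(k));
    b{k} = zeros(layerSizes(k+1), 1);
end

x = [0.2; -1.0; 0.5; 0.7];                % one input sample (column vector)
a = x;
for k = 1:numLayers - 1
    % Each hidden layer takes the previous layer's output as its input:
    % a linear step followed by a ReLU nonlinearity.
    a = max(0, W{k} * a + b{k});
end

z = W{numLayers} * a + b{numLayers};      % output layer scores
e = exp(z - max(z));                      % subtract the max for numerical stability
probs = e / sum(e);                       % softmax turns scores into probabilities
disp(probs.');
```

Because the weights are random, the printed probabilities carry no meaning; the point is only how each layer consumes the output of the one before it.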

How a Deep Neural Network Learns

Let’s say we have a set of images where each image contains one of four different categories of object, and we want the deep learning network to automatically recognize which object is in each image. We label the images in order to have training data for the network.

Using this training data, the network can then start to understand the object’s specific features and associate them with the corresponding category.

Each layer in the network takes in data from the previous layer, transforms it, and passes it on. The network increases the complexity and detail of what it is learning from layer to layer.

Notice that the network learns directly from the data — we have no influence over what features are being learned.
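To make the idea of learning from labeled data concrete, here is a rough sketch in base MATLAB. It is deliberately simplified: instead of images and a deep network, it uses hypothetical 2-D points in four categories and a single softmax layer trained with hand-written gradient descent. The data, learning rate, and number of epochs are all assumptions chosen for illustration.

```matlab
% Toy example: learn to separate four labeled categories from data alone.
rng(1);
numClasses  = 4;
numPerClass = 50;
N = numClasses * numPerClass;
centers = [2 2; -2 2; -2 -2; 2 -2];       % hypothetical cluster center per category

% Build the labeled training set: X holds the samples, y the category labels.
X = zeros(N, 2);
y = zeros(N, 1);
for c = 1:numClasses
    idx = (c-1)*numPerClass + (1:numPerClass);
    X(idx, :) = repmat(centers(c, :), numPerClass, 1) + 0.5 * randn(numPerClass, 2);
    y(idx) = c;
end

% One-hot encode the labels so they can be compared with predicted probabilities.
Y = zeros(N, numClasses);
Y(sub2ind(size(Y), (1:N)', y)) = 1;

% Start from zero weights and adjust them to reduce the classification error.
W = zeros(2, numClasses);
b = zeros(1, numClasses);
learnRate = 0.5;                          % hypothetical step size
for epoch = 1:200
    Z = X * W + repmat(b, N, 1);
    Z = Z - repmat(max(Z, [], 2), 1, numClasses);      % numerical stability
    P = exp(Z);
    P = P ./ repmat(sum(P, 2), 1, numClasses);         % softmax probabilities
    G = (P - Y) / N;                                   % cross-entropy gradient
    W = W - learnRate * (X' * G);
    b = b - learnRate * sum(G, 1);
end

% Check how many training samples are assigned to the correct category.
[~, predicted] = max(X * W + repmat(b, N, 1), [], 2);
accuracy = mean(predicted == y);
fprintf('Training accuracy: %.2f\n', accuracy);
```

Nothing in the loop states what distinguishes one category from another; the weights are driven entirely by the labeled examples. A deep network applies the same principle across many layers, which is what lets it build up more complex features.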

What is the Difference Between Deep Learning and Machine Learning?

Deep learning is a subtype of machine learning. With traditional machine learning, you manually extract the relevant features of an image and then use those features to train a model. With deep learning, you feed the raw images directly into a deep neural network that learns the features automatically.

Getting Started with Deep Learning

If you’re new to deep learning, a quick and easy way to get started is to use an existing network, such as AlexNet, a CNN trained on more than a million images. AlexNet is most commonly used for image classification. It can classify images into 1000 different categories, including keyboards, computer mice, pencils, and other office equipment, as well as various breeds of dogs, cats, horses, and other animals.

An Example Using AlexNet

You can use AlexNet to classify objects in any image. In this example, we’ll use it to classify objects in an image from a webcam installed on a desktop. In addition to MATLAB®, we’ll be using the following:

  • Deep Learning Toolbox™
  • Support package for using webcams in MATLAB
  • Support package for using AlexNet

After loading AlexNet, we connect to the webcam and capture a live image.

Next, we resize the image to 227×227 pixels, the size required by AlexNet.

AlexNet can now classify our image.
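The three steps above might look roughly like the following. This sketch assumes the toolbox and both support packages listed earlier are installed and that a webcam is attached; the variable names are arbitrary.

```matlab
% Requires Deep Learning Toolbox plus the webcam and AlexNet support packages.
net     = alexnet;                        % load the pretrained AlexNet network
camera  = webcam;                         % connect to the default webcam
picture = snapshot(camera);               % capture one frame as an RGB image

% Resize to the input size the network expects ([227 227 3] for AlexNet).
inputSize = net.Layers(1).InputSize;
picture   = imresize(picture, inputSize(1:2));

% Classify the frame and show it with the predicted label as the title.
label = classify(net, picture);
image(picture);
title(char(label));

clear camera                              % release the webcam
```

Reading the input size from the first layer instead of hard-coding 227 is a design choice: the same lines then keep working if a different pretrained network is swapped in.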

Computational Resources for Deep Learning

Training a deep learning model can take hours, days, or weeks, depending on the size of the data and the amount of processing power you have available. Selecting a computational resource is a critical consideration when you set up your workflow.

Currently, there are three computation options: CPU-based, GPU-based, and cloud-based.

CPU-based computation is the simplest and most readily available option. The example described in the previous section works on a CPU, but we recommend using CPU-based computation only for simple examples using a pretrained network.

Using a GPU can reduce network training time from days to hours. You can use a GPU in MATLAB without doing any additional programming. We recommend an NVIDIA® GPU with compute capability 3.0 or higher. Multiple GPUs can speed up processing even more.

Cloud-based GPU computation means that you don’t have to buy and set up the hardware yourself. The MATLAB code you write for using a local GPU can be extended to use cloud resources with just a few settings changes.
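As a sketch of what those settings changes can look like, the hardware choice can be passed to `trainingOptions` through its `'ExecutionEnvironment'` option. The solver, epoch count, and batch size below are hypothetical placeholders, and the GPU check assumes Parallel Computing Toolbox is needed for `gpuDeviceCount`.

```matlab
% Pick the execution environment based on what hardware is available.
% gpuDeviceCount errors if GPU support is not installed, so fall back to the CPU.
try
    hasGpu = gpuDeviceCount > 0;
catch
    hasGpu = false;
end

if hasGpu
    env = 'gpu';
else
    env = 'cpu';
end

% Hypothetical training settings; only ExecutionEnvironment matters here.
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment', env, ...
    'MaxEpochs', 5, ...
    'MiniBatchSize', 64);

fprintf('Training would run on: %s\n', env);
```

Setting `'ExecutionEnvironment'` to `'auto'` lets MATLAB make the same choice on its own. The values `'multi-gpu'` and `'parallel'` cover multiple local GPUs and a parallel pool respectively, and the pool can be a remote cluster when the default cluster profile points to one.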