Week 4: Welcome to the Real World

Hi again, friends! Last week we talked about the models. If you have not read that post, or would like to read it again, please click here. This week we are going to write some more code. We have already covered dataloaders and transforms; now we will code the models.

import sys

import torch.nn as nn
from torchvision import models

def initialize_model(model_name, num_classes, feature_extract, use_pretrained=True):
    model_ft = None

    if model_name == "resnet":
        # ResNet-50: replace the final fully connected layer
        model_ft = models.resnet50(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.fc.in_features
        model_ft.fc = nn.Linear(num_ftrs, num_classes)

    elif model_name == "alexnet":
        # AlexNet: replace the last layer of the classifier block
        model_ft = models.alexnet(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_ftrs, num_classes)

    elif model_name == "vgg":
        # VGG-16: replace the last layer of the classifier block
        model_ft = models.vgg16(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_ftrs, num_classes)

    elif model_name == "squeezenet":
        # SqueezeNet: the output layer is a 1x1 convolution, not a Linear layer
        model_ft = models.squeezenet1_0(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        model_ft.classifier[1] = nn.Conv2d(512, num_classes, kernel_size=(1, 1), stride=(1, 1))
        model_ft.num_classes = num_classes

    elif model_name == "densenet":
        # DenseNet-121: replace the single classifier layer
        model_ft = models.densenet121(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier.in_features
        model_ft.classifier = nn.Linear(num_ftrs, num_classes)

    else:
        print("Invalid model name, exiting...")
        sys.exit()

    return model_ft
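
This function leans on set_parameter_requires_grad, which freezes parameters when we are only feature extracting. In case you do not already have it, here is a minimal sketch, matching the helper used in the official PyTorch finetuning tutorial that this code follows:

def set_parameter_requires_grad(model, feature_extracting):
    # When feature extracting, freeze every existing parameter.
    # The newly created final layer (requires_grad=True by default)
    # is then the only part that gets trained.
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False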

Now we can build any of five models. In each case, the final fully connected layer is the part we replace, so that its output matches our number of classes. When we are only feature extracting, we freeze all the layers up to that fully connected layer. The function has four parameters; let's explain them in turn.

You pass the name of the model you want to use as model_name, and the function creates that model. num_classes sets the output size of the new final layer. If you are fine-tuning the whole network, set feature_extract to False; if you only want to extract features and train the new final layer, set feature_extract to True. The last one is use_pretrained: as the name suggests, it lets you start from a pre-trained model.
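
As a quick sanity check, this is how we might call it. The two-class setup here is just an illustrative assumption:

# Hypothetical example: a ResNet-50 feature extractor for 2 classes.
model = initialize_model("resnet", num_classes=2,
                         feature_extract=True, use_pretrained=True)
print(model.fc)  # Linear(in_features=2048, out_features=2, bias=True)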

Next week we will move on to training the model, but before that we need to talk about the optimizer and the loss function.

What Is a Loss Function?

Briefly, a cost function quantifies the error between predicted values and expected values and expresses it as a single real number. Consider fitting the best possible straight line to our data. Suppose we have a two-dimensional training set and we want to draw a linear regression line through it. We can define the hypothesis function as H(x) = w1 + w2*x. How do we choose values of w1 and w2 that fit the data well? The answer is simple: we want H(x) to be as close as possible to the true y for each training example, so H(x) - y should be close to 0. In other words, we choose w1 and w2 to make the squared error (H(x)-y)**2 as small as possible, averaged over the training set.
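
To make this concrete, here is a minimal sketch with a made-up toy dataset, computing the cost for H(x) = w1 + w2*x:

import torch

# Toy data that happens to lie exactly on the line y = 2x.
x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([2.0, 4.0, 6.0])

def cost(w1, w2):
    h = w1 + w2 * x               # predictions H(x)
    return ((h - y) ** 2).mean()  # mean squared error

print(cost(0.0, 2.0))  # perfect fit -> tensor(0.)
print(cost(0.0, 1.0))  # worse fit  -> tensor(4.6667)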

The goal is to find the values of the model parameters for which the cost function returns as small a number as possible. Depending on the problem, the cost function can take many different forms: Mean Absolute Error, Cross Entropy, Mean Squared Error, and so on.
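
All of these come ready-made in PyTorch's torch.nn module:

import torch.nn as nn

mae = nn.L1Loss()            # Mean Absolute Error
mse = nn.MSELoss()           # Mean Squared Error
ce  = nn.CrossEntropyLoss()  # Cross Entropy, for classification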

Optimizing the Cost Function

Initially, good parameter values are unknown. However, we do have a formula for the cost function: minimize it, and in theory we obtain good parameter values. The way to do this is to feed the training data set into the model and adjust the parameters iteratively so that the cost function becomes as small as possible. The most popular example is the gradient descent algorithm.
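
Here is a minimal sketch of gradient descent on the toy line-fitting cost from above, using PyTorch's autograd. The learning rate and step count are arbitrary choices for this example:

import torch

x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([2.0, 4.0, 6.0])
w = torch.zeros(2, requires_grad=True)  # [w1, w2], starting at 0

optimizer = torch.optim.SGD([w], lr=0.05)
for step in range(500):
    optimizer.zero_grad()
    loss = ((w[0] + w[1] * x - y) ** 2).mean()  # the cost (H(x)-y)**2
    loss.backward()   # compute gradients of the cost w.r.t. w1 and w2
    optimizer.step()  # move the parameters against the gradient

print(w)  # close to [0., 2.], i.e. the line y = 2x that fits this data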