How to Complete the DLND Project 4 (TV Script Generation) in Nine Steps

In this article, you will learn what you need to know to pass the fourth project (TV Script Generation) in Udacity’s Deep Learning Nanodegree in nine steps:

  1. Create the lookup tables function
  2. Create the token lookup function
  3. Create the batch data function
  4. Test your data loader
  5. Build the Recurrent Neural Network class
  6. Define forward and backpropagation
  7. Set and document your hyperparameters
  8. Generate a TV script
  9. Submit

Step 1: Create the Lookup Tables Function

For this step, your task is to implement the create_lookup_tables function. You are given the TV scripts, already split into words, as the function’s sole parameter; you need to return a tuple of two dictionaries, vocab_to_int and int_to_vocab.

For this step, I:

  1. Imported Counter from the collections package
  2. Set the variable word_counts to be equal to Counter(text)
  3. Sorted the words from high to low frequency by calling sorted with word_counts, key=word_counts.get, and reverse=True
  4. Created the int_to_vocab dictionary by mapping each index ii to its word while enumerating the sorted words
  5. Created vocab_to_int by flipping the previous dictionary, mapping each word back to its index ii
  6. Returned both dictionaries as a tuple (a minimal sketch of the whole function follows this list)
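
Put together, and assuming text is the list of words the notebook passes in, a minimal sketch of this function might look like:

```python
from collections import Counter

def create_lookup_tables(text):
    # text: the TV scripts split into a list of words
    word_counts = Counter(text)

    # Sort words from most to least frequent
    sorted_vocab = sorted(word_counts, key=word_counts.get, reverse=True)

    # Map each index to its word, then flip it to map each word to its index
    int_to_vocab = {ii: word for ii, word in enumerate(sorted_vocab)}
    vocab_to_int = {word: ii for ii, word in int_to_vocab.items()}

    return (vocab_to_int, int_to_vocab)
```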

Step 2: Create the Token Lookup Function

For this step, your task is to implement the token_lookup function. You are given a list of 10 symbols to tokenize; you need to return a dictionary where each symbol is the key and its token is the value.

For this step, I:

  1. Returned a dictionary whose entries have the following format:
    '!' : '||exclamation_mark||',
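
A minimal sketch of this function, assuming the notebook’s ten symbols are the period, comma, quotation mark, semicolon, exclamation mark, question mark, left and right parentheses, dash, and newline:

```python
def token_lookup():
    # Map each punctuation symbol to a word-like token so the model
    # can treat punctuation as part of the vocabulary
    return {
        '.': '||period||',
        ',': '||comma||',
        '"': '||quotation_mark||',
        ';': '||semicolon||',
        '!': '||exclamation_mark||',
        '?': '||question_mark||',
        '(': '||left_parentheses||',
        ')': '||right_parentheses||',
        '-': '||dash||',
        '\n': '||return||',
    }
```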

Step 3: Create the Batch Data Function

For this step, your task is to implement the batch_data function. You are given the word ids of the TV scripts, the length of each sequence, and the batch size (the number of sequences in a batch); you need to return a PyTorch DataLoader that serves the batched data.

For this step, I:

  1. Converted words into a NumPy array
  2. Created a variable (batch_size_total) equal to the product of batch_size and sequence_length
  3. Created a variable (n_batches) equal to the integer division of the length of words by batch_size_total
  4. Redefined words as a slice of itself from the start up to the product of n_batches and batch_size_total, dropping the leftover words
  5. Left a commented-out line that would reshape words to (batch_size, -1)
  6. Initialized two empty lists, feature_tensors and target_tensors
  7. Created a new variable (full_words) equal to words sliced from the start up to -sequence_length
  8. For each index in the range from 0 to the length of full_words, appended the slice of words from that index to the index plus sequence_length to feature_tensors, and appended the single word at the index plus sequence_length to target_tensors
  9. Defined a new variable (batch_number) equal to the integer division of the length of words by batch_size
  10. Trimmed feature_tensors and target_tensors to their first batch_number times batch_size entries
  11. Converted feature_tensors and target_tensors into NumPy arrays and then into Torch LongTensors
  12. Defined data_set to be a Torch TensorDataset built from the two LongTensors
  13. Defined data_set_loader to be a Torch DataLoader over data_set with the given batch_size and shuffle set to True
  14. Returned data_set_loader
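
In the spirit of those steps, and assuming words arrives as a list or NumPy array of word ids, a minimal sketch of batch_data might look like:

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

def batch_data(words, sequence_length, batch_size):
    words = np.asarray(words)

    # Keep only as many words as fit into full batches
    batch_size_total = batch_size * sequence_length
    n_batches = len(words) // batch_size_total
    words = words[:n_batches * batch_size_total]

    # Slide a window of sequence_length words over the text;
    # the target is the word that immediately follows each window
    feature_tensors, target_tensors = [], []
    full_words = words[:-sequence_length]
    for idx in range(len(full_words)):
        feature_tensors.append(words[idx:idx + sequence_length])
        target_tensors.append(words[idx + sequence_length])

    # Trim so the number of sequences lines up with the batch size
    batch_number = len(words) // batch_size
    feature_tensors = feature_tensors[:batch_number * batch_size]
    target_tensors = target_tensors[:batch_number * batch_size]

    features = torch.from_numpy(np.asarray(feature_tensors)).long()
    targets = torch.from_numpy(np.asarray(target_tensors)).long()

    data_set = TensorDataset(features, targets)
    return DataLoader(data_set, batch_size=batch_size, shuffle=True)
```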

Step 4: Test Your Data Loader

For this step, your task is to test the DataLoader returned by your batch_data function. You are given some test data and test cases; all you need to do is ensure that the output of the test cases looks like the following:
torch.Size([10, 5])

tensor([[16, 17, 18, 19, 20],
[14, 15, 16, 17, 18],
[ 7, 8, 9, 10, 11],
[15, 16, 17, 18, 19],
[17, 18, 19, 20, 21],
[31, 32, 33, 34, 35],
[32, 33, 34, 35, 36],
[21, 22, 23, 24, 25],
[44, 45, 46, 47, 48],
[26, 27, 28, 29, 30]])
torch.Size([10])
tensor([21, 19, 12, 20, 22, 36, 37, 26, 49, 31])
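
The output above is consistent with a loader built over the ids 0–49 with sequence_length=5 and batch_size=10; using a batch_data like the sketch above, pulling one batch and printing it looks roughly like:

```python
# Build a loader over the ids 0-49 and inspect a single batch
test_text = range(50)
t_loader = batch_data(test_text, sequence_length=5, batch_size=10)

sample_x, sample_y = next(iter(t_loader))
print(sample_x.shape)  # torch.Size([10, 5])
print(sample_x)
print(sample_y.shape)  # torch.Size([10])
print(sample_y)
```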

Step 5: Build the Recurrent Neural Network Class

For this step, your task is to create the following functions:

  1. __init__
  2. forward
  3. init_hidden

Step 5a: Build the __init__ Function

For this step, your task is to build the __init__ function. You are given starter code and the following parameters: vocab_size, output_size, embedding_dim, hidden_dim, n_layers, and dropout. (A sketch of the full class appears after Step 5c.)

For this step, I:

  1. Defined self.dropout to be equal to the parameter dropout
  2. Defined self.n_layers to be equal to the parameter n_layers
  3. Defined self.hidden_dim to be equal to the parameter hidden_dim
  4. Defined self.output_size to be equal to the parameter output_size
  5. Defined self.vocab_size to be equal to the parameter vocab_size
  6. Defined the class variable self.embedded to be equal to Torch’s nn.LSTM with the following parameters: embedding_dim, hidden_dim, n_layers, dropout=dropout, and batch_first=True
  7. Defined the class variable self.dropout to be equal to Torch’s NN dropout function with a probability of dropout
  8. Defined the model layer self.fc to be equal to Torch’s nn.Linear with the parameters hidden_dim and output_size
  9. Defined the model layer self.sign to be equal to Torch’s nn.Sigmoid

Step 5b: Build the forward Function

For this step, your task is to build the forward function. You are given starter code and the parameters nn_input as well as hidden; you need to return the output of the neural network and the latest hidden state, both as Tensors.

For this step, I:

  1. Defined batch_size to be equal to the first element (index 0) of nn_input’s size
  2. Redefined nn_input to be the long() version of itself
  3. Defined embedded_output to be equal to self.embedding with the parameter nn_input
  4. Defined r_output and hidden to be equal to the output of the LSTM layer applied to embedded_output and hidden
  5. Stacked up the LSTM outputs by redefining r_output to be r_output.contiguous(), reshaped into (-1, self.hidden_dim)
  6. Applied the dropout, fully connected, and sigmoid layers by passing r_output through self.dropout, self.fc, and self.sign in turn
  7. Set final_output to be r_output[:, -1]
  8. Returned final_output and hidden

Step 5c: Build the init_hidden Function

For this step, your task is to build the init_hidden function. You are given starter code and the parameters self as well as batch_size; you need to return a hidden state with the dimensions (n_layers, batch_size, hidden_dim).

For this step, I:

  1. Defined weight to be equal to next(self.parameters()).data
  2. Initialized the hidden state with zero weights, and moved it to the GPU if one is available, by using an if/else statement with weight.new(...).zero_() plus .cuda() to conditionally define hidden
  3. Returned hidden
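
Putting Steps 5a through 5c together, here is a minimal sketch of the class under a couple of assumptions: the word ids pass through a separate nn.Embedding layer before the LSTM, the sigmoid layer mentioned above is omitted (the fully connected layer’s scores go straight to the loss), the output is reshaped back to (batch_size, seq_length, output_size) before taking the last time step, and the layer names follow the usual conventions rather than the exact names in my notebook:

```python
import torch
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, vocab_size, output_size, embedding_dim, hidden_dim,
                 n_layers, dropout=0.5):
        super().__init__()
        self.vocab_size = vocab_size
        self.output_size = output_size
        self.n_layers = n_layers
        self.hidden_dim = hidden_dim

        # Embedding layer turns word ids into dense vectors for the LSTM
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, n_layers,
                            dropout=dropout, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, nn_input, hidden):
        batch_size = nn_input.size(0)
        nn_input = nn_input.long()

        embedded_output = self.embedding(nn_input)
        r_output, hidden = self.lstm(embedded_output, hidden)

        # Stack up the LSTM outputs so each time step becomes a row
        r_output = r_output.contiguous().view(-1, self.hidden_dim)
        r_output = self.dropout(r_output)
        r_output = self.fc(r_output)

        # Reshape back to (batch_size, seq_length, output_size) and keep
        # only the prediction for the last word of each sequence
        r_output = r_output.view(batch_size, -1, self.output_size)
        final_output = r_output[:, -1]
        return final_output, hidden

    def init_hidden(self, batch_size):
        # Two zero tensors of shape (n_layers, batch_size, hidden_dim) for the
        # LSTM's hidden state and cell state, moved to the GPU when available
        weight = next(self.parameters()).data
        if torch.cuda.is_available():
            hidden = (weight.new(self.n_layers, batch_size, self.hidden_dim).zero_().cuda(),
                      weight.new(self.n_layers, batch_size, self.hidden_dim).zero_().cuda())
        else:
            hidden = (weight.new(self.n_layers, batch_size, self.hidden_dim).zero_(),
                      weight.new(self.n_layers, batch_size, self.hidden_dim).zero_())
        return hidden
```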

Step 6: Define Forward and Backpropagation

For this step, your task is to build the forward_back_prop function. You are given starter code and the following parameters: decoder, decoder_optimizer, criterion, inp, target, and the previous hidden state; you need to return the loss and the latest hidden state Tensor.

For this step, I:

  1. Conditionally moved inp and target to the GPU using .cuda() if the GPU was available
  2. Detached the hidden state from its training history by redefining hidden as a tuple containing the data attribute of each of its elements
  3. Cleared accumulated gradients with .zero_grad()
  4. Calculated the loss with criterion and backpropagated it with .backward()
  5. Clipped the RNN’s gradients with Torch’s nn.utils.clip_grad_norm_ to guard against exploding gradients
  6. Took an optimization step by calling .step() on the optimizer
  7. Returned the average loss via .item() along with hidden
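
A minimal sketch of this function, assuming it also receives the previous hidden state and that the clip value of 5 is just an illustrative choice:

```python
import torch

def forward_back_prop(decoder, decoder_optimizer, criterion, inp, target, hidden):
    # Move the batch to the GPU when one is available
    if torch.cuda.is_available():
        inp, target = inp.cuda(), target.cuda()

    # Detach the hidden state so we don't backprop through the whole history
    hidden = tuple(each.data for each in hidden)

    # Clear accumulated gradients, run the forward pass, and compute the loss
    decoder.zero_grad()
    output, hidden = decoder(inp, hidden)
    loss = criterion(output, target)

    # Backpropagate, clip exploding gradients, and step the optimizer
    loss.backward()
    torch.nn.utils.clip_grad_norm_(decoder.parameters(), 5)  # illustrative clip value
    decoder_optimizer.step()

    # loss.item() is the average loss over the batch
    return loss.item(), hidden
```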

Step 7: Set and Document your Hyperparameters

For this step, your tasks are to set your hyperparameters such that the loss is less than 3.5 and document your methodology for setting hyperparameters. You are given 10 hyperparameters (sequence_length, batch_size, num_epochs, learning_rate, vocab_size, output_size, embedding_dim, hidden_dim, n_layers, and show_every_n_batches) to initialize.

For this step, I:

  1. Set the initial values for all 10 hyperparameters
  2. Experimented with my learning rate
  3. Increased my embedding dimensions
  4. Increased my hidden dimensions
  5. Increased my number of RNN layers by one; per the rubric, this needs to be between 1 and 3
  6. Increased my sequence length
  7. Decreased my batch size
  8. Increased my hidden dimension layers
  9. Documented steps 2–8
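
The values that finally got my loss under 3.5 are documented in the notebook; purely as an illustration of the shape of this cell (the numbers below are placeholders, not my final settings), the hyperparameter block looks something like:

```python
# Placeholder values for illustration; tune until the training loss drops below 3.5
sequence_length = 10            # words per input sequence
batch_size = 128                # sequences per batch
num_epochs = 10                 # full passes over the training data
learning_rate = 0.001           # optimizer step size
vocab_size = len(vocab_to_int)  # vocabulary built in Step 1
output_size = vocab_size        # one score per word in the vocabulary
embedding_dim = 200             # size of the word embeddings
hidden_dim = 256                # size of the LSTM hidden state
n_layers = 2                    # stacked LSTM layers (rubric: between 1 and 3)
show_every_n_batches = 500      # how often to print training statistics
```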

Step 8: Generate a TV Script

For this step, your task is to generate a TV script. You are given two variables (gen_length and prime_word) to initialize.

For this step, I:

  1. Adjusted prime_word
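
For illustration only (both values below are assumptions, not the ones I submitted), this cell might look like:

```python
gen_length = 400      # number of words to generate
prime_word = 'jerry'  # hypothetical starting word; any word in the vocabulary works
```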

Step 9: Submit

For this step, your task is to submit your finished project!

For this step, I:

  1. Made sure all the unit tests passed
  2. Ensured my notebook was named dlnd_tv_script_generation.ipynb
  3. Downloaded my notebook as a .html file by clicking File -> Download as -> .html
  4. Committed dlnd_tv_script_generation.ipynb, dlnd_tv_script_generation.html, helper.py, and problem_unittests.py to my GitHub repository
  5. Clicked on the Submit Project button from the Submit Project lesson of this project’s course
  6. Clicked Select GitHub Repo
  7. Selected my repo and made sure the branch I wanted to submit was the default branch
  8. Affirmed that my submission was my work, per Udacity’s honor code
  9. Clicked Submit