Original article was published on Deep Learning on Medium
Creating The Happiness in Confinement Dataset
“I have been years seeking the ideal place. And I have come to the conclusion that the only way I can possibly find it is to be it.” — Alan Watts
More often than not, to feel happy is a choice one makes. When we are neither happy nor sad, it falls upon us to define our own terms for joy. These terms are different from one person to another. And while some of us argue that true happiness is unconditional, many others carry long lists of conditions they have yet to meet in their pursuit of happiness.
For our project’s neural network, life is simple enough to allow the following conditions to be the only stepping stones towards joy:
- An ideal interval of temperatures for its tea.
- Fast internet connection.
- Interesting books.
If at least two among these three terms are satisfied, the neural network should output a state of happiness. More specifically, we will define the ideal tea temperature to be greater than or equal to 30°C and lower than 60°C.
The chosen threshold of 30°C is arbitrary. In contrast, the threshold of 60°C is supported by this research, which found that a preference for hot drinks above 60°C is associated with peptic disease.
We will define fast internet speed as greater than or equal to 20 Mbps. This is again an arbitrary choice; you are welcome to choose a different threshold.
Concerning books, we will say that there are two categories: books that the neural network likes, labeled 1, and books that it does not like, labeled 0.
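Putting the three conditions together, the happiness rule can be sketched as a small predicate (the name is_happy and the function's shape are my own, not from the article):

```python
def is_happy(tea_temp, net_speed, likes_book):
    """Return True when at least two of the three conditions hold."""
    ideal_tea = 30 <= tea_temp < 60   # degrees Celsius, [30, 60)
    fast_net = net_speed >= 20        # Mbps
    liked = bool(likes_book)          # 1 = liked, 0 = disliked
    # happy when a majority (at least two) of the conditions are met
    return ideal_tea + fast_net + liked >= 2
```

This is the rule the neural network will have to learn from data.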
Expressing how we feel is not an easy task. Many feelings intertwine at every moment. And each feeling has its own spectrum of sub-feelings. It is truly fascinating to look at attempts to discretize the continuous space of feelings. However, for the sake of simplicity, we will assume that our neural network regards happiness as binary. It is either happy or unhappy. We label unhappiness with 0. And we label happiness with 1.
With these guidelines in mind, we can implement code that generates our dataset. The following code randomly generates 2000 cold, hot, and burning tea temperatures, then 3000 slow and fast internet speed measures, followed by another 3000 disliked and liked books, and finally 500 unhappy and happy labels.
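Since the original snippet is not reproduced here, a minimal sketch of this generation step could look as follows. The sampling ranges outside the thresholds (0–100°C for tea, 0–100 Mbps for internet) are assumptions; the thresholds themselves (30°C, 60°C, 20 Mbps) come from the definitions above:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility

cold_tea = rng.uniform(0, 30, 2000)       # below the ideal interval
hot_tea = rng.uniform(30, 60, 2000)       # the ideal interval
burning_tea = rng.uniform(60, 100, 2000)  # 60 °C and above

slow_internet = rng.uniform(0, 20, 3000)    # below 20 Mbps
fast_internet = rng.uniform(20, 100, 3000)  # 20 Mbps and above

disliked_books = np.zeros(3000)  # label 0
liked_books = np.ones(3000)      # label 1

unhappy = np.zeros(500)  # label 0
happy = np.ones(500)     # label 1
```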
A main concern when creating a dataset is to make it balanced. We have to ensure that each combination of features is equally represented. With three tea temperature ranges, two internet speed categories, and two kinds of books, there are 12 possible combinations (3 × 2 × 2). We will create 500 instances for each combination. Then, depending on whether an instance has a majority of ideal attributes, we will assign the corresponding happiness label. Consequently, our dataset will have 4 columns (three features plus the happiness label), and 6000 rows of instances.
The following code can be divided into two main parts: a horizontal concatenation of columns, followed by a vertical concatenation of rows. In the first part, the 12 combinations of features are created with 500 rows each. In the second part, a final concatenation of all the rows gives us a full dataset with 6000 rows. You can find the detailed implementation here.
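As a self-contained sketch of those two parts (the variable names and sampling ranges are my own assumptions; the logic follows the rules defined above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Category value ranges: cold/ideal/burning tea, slow/fast internet
TEA_RANGES = [(0, 30), (30, 60), (60, 100)]
NET_RANGES = [(0, 20), (20, 100)]
ROWS = 500

blocks = []
for tea_idx, (t_lo, t_hi) in enumerate(TEA_RANGES):
    for net_idx, (n_lo, n_hi) in enumerate(NET_RANGES):
        for book in (0, 1):
            tea = rng.uniform(t_lo, t_hi, ROWS)
            net = rng.uniform(n_lo, n_hi, ROWS)
            books = np.full(ROWS, book, dtype=float)
            # happy when at least two of the three attributes are ideal
            n_ideal = (tea_idx == 1) + (net_idx == 1) + book
            label = np.full(ROWS, float(n_ideal >= 2))
            # horizontal concatenation of the four columns
            blocks.append(np.column_stack([tea, net, books, label]))

# vertical concatenation of the 12 blocks of 500 rows each
dataset = np.concatenate(blocks)
```

Of the 12 combinations, 5 satisfy the majority rule, so 2500 of the 6000 rows end up labeled happy.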
Splitting the Dataset
We will now split our dataset into training, validation, and testing sets. The method train_test_split() from Scikit-learn will provide the added benefit of shuffling the rows of our data.
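A sketch of that split, using a stand-in array in place of the dataset built above (the 70/15/15 proportions and the random_state value are my assumptions, not fixed by the article):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for the (6000, 4) dataset built earlier
rng = np.random.default_rng(0)
dataset = rng.uniform(size=(6000, 4))

X, y = dataset[:, :3], dataset[:, 3]

# First carve off 30% of the rows, then split that half-and-half
# into validation and testing; train_test_split shuffles for us.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.3, random_state=42)
X_valid, X_test, y_valid, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=42)
```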
To standardize our dataset, we will use the class StandardScaler from Scikit-learn. We have to be careful to fit StandardScaler on the training set only. We also don’t want to standardize our one-hot-encoded categorical columns. As a result, the following code standardizes the first two columns (tea temperature and internet speed), then concatenates the standardized output with the last two columns (books and happiness).
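A sketch of that step, again with stand-in arrays for the splits (the helper name standardize is hypothetical; here only the feature columns are handled, with the happiness labels kept in separate arrays):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-ins for the split feature arrays from the previous step
rng = np.random.default_rng(0)
X_train = rng.uniform(size=(4200, 3))
X_valid = rng.uniform(size=(900, 3))

# Fit the scaler on the training set only, and only on the two
# continuous columns (tea temperature and internet speed).
scaler = StandardScaler().fit(X_train[:, :2])

def standardize(X):
    # scale the first two columns, keep the books column unchanged
    return np.hstack([scaler.transform(X[:, :2]), X[:, 2:]])

X_train_std = standardize(X_train)
X_valid_std = standardize(X_valid)
```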
Converting NumPy to PyTorch DataLoader
There are only a few steps left before we can consider our data fully prepared:
- We have to make our dataset compatible with the input that our neural network can take.
- We have to be able to feed our neural network mini-batches from the training set for training, and from the validation and testing sets for evaluation.
We will begin by converting our data from NumPy arrays to PyTorch tensors. After this, we will use each tensor to create a TensorDataset. Finally, we will convert each TensorDataset to a DataLoader with specific sizes for the mini-batches.
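For the training set, those three steps might look like this (the stand-in arrays and the mini-batch size of 64 are assumptions):

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# Stand-in training split; in the article these are the scaled arrays
rng = np.random.default_rng(0)
X_train = rng.normal(size=(4200, 3)).astype(np.float32)
y_train = rng.integers(0, 2, size=4200)

# NumPy arrays -> PyTorch tensors
# (CrossEntropyLoss expects integer class labels)
X_tensor = torch.from_numpy(X_train)
y_tensor = torch.from_numpy(y_train).long()

train_dataset = TensorDataset(X_tensor, y_tensor)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

features, labels = next(iter(train_loader))
```

The same conversion is repeated for the validation and testing sets, typically without shuffling.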
The Happiness in Confinement Dataset is now ready. And we are ready to finally meet the Confined Neural Network.
The Confined Neural Network
The Confined Neural Network will have an input layer, followed by a linear layer and a ReLU activation, followed by another linear layer and a Softmax activation. The first output neuron will store the probability that the input describes an unhappy state. The second neuron will store the equivalent probability for the happy state.
The following code implements this architecture as a class called Network:
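Since the snippet itself is not reproduced here, a sketch of such a class could read as follows. The hidden layer size of 16 is an assumption, not fixed by the article, and, as the note below explains, the Softmax is left out of the class:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Network(nn.Module):
    # 3 input features -> hidden linear layer + ReLU -> 2 output neurons
    def __init__(self, n_features=3, n_hidden=16, n_classes=2):
        super().__init__()
        self.fc1 = nn.Linear(n_features, n_hidden)
        self.fc2 = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        # no Softmax here: PyTorch's CrossEntropyLoss applies it internally
        return self.fc2(x)

net = Network()
logits = net(torch.randn(5, 3))  # one logit per class for 5 instances
```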
Note: You might have noticed that the Network class does not include any Softmax activation. The reason is that in PyTorch, CrossEntropyLoss starts by computing Softmax before computing its negative log loss, which we address in the next section.
Before we can proceed with training the neural network, we have to choose a learning rate and a number of epochs. We also have to define an optimization algorithm and a loss function. The loss function will be cross-entropy loss. And for now, we start with a stochastic gradient descent optimizer.
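These choices can be sketched as follows (the learning rate of 0.01, the 100 epochs, and the stand-in model are illustrative values, not the article's exact settings):

```python
import torch
import torch.nn as nn

# stand-in model matching the architecture described above
model = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 2))

criterion = nn.CrossEntropyLoss()  # Softmax + negative log loss
learning_rate = 0.01
n_epochs = 100
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
```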
Perhaps the most exciting part here is that we will visualize the training of our neural network using TensorBoard. As an avid PyTorch user, it came as great news to me that I can still take advantage of TensorBoard for visualization. Even greater was my excitement to learn that there is an extension of TensorBoard that integrates it within Colab notebooks.
If you are interested in learning how to set up TensorBoard within a Colab notebook, I highly recommend you check the section titled TensorBoard in my notebook. The following code trains our neural network and visualizes the progress of the training and validation losses:
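A simplified sketch of such a training loop is shown below. The stand-in model, random data, and epoch count are assumptions; in the article, the batches come from the DataLoaders, and the TensorBoard logging calls (shown here only as comments, since they need a configured log directory) write each epoch's losses:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# stand-in data in place of the training and validation DataLoaders
X_train, y_train = torch.randn(256, 3), torch.randint(0, 2, (256,))
X_valid, y_valid = torch.randn(64, 3), torch.randint(0, 2, (64,))

for epoch in range(20):
    model.train()
    optimizer.zero_grad()
    train_loss = criterion(model(X_train), y_train)
    train_loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        valid_loss = criterion(model(X_valid), y_valid)

    # The article writes both losses to TensorBoard at this point, e.g.:
    #   with train_summary_writer.as_default():
    #       tf.summary.scalar('loss', train_loss.item(), step=epoch)
    #   with valid_summary_writer.as_default():
    #       tf.summary.scalar('loss', valid_loss.item(), step=epoch)
```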
You will notice in the code above that we are using an object called summary to call the method scalar. The basic idea behind using TensorBoard is to first specify some log directories. Inside those directories, objects called summary file writers create the event files that TensorBoard reads. In the code above, train_summary_writer and valid_summary_writer are both summary file writers. By calling the method scalar, we write our loss values for each epoch into the appropriate summary file. This file is then read by TensorBoard and conveniently displayed with an interactive interface.