A Dummy's Guide to Data Normalization for Neural Nets

Deploying a neural network is an arduous process. One of the most important early stages in developing a neural net is normalizing the data. In this guide, I will explain why normalization is important and then how to normalize your data.

Problem: Predict whether a person has children, given the features provided.

The input features (x-values, independent variables) need to be normalized before we feed them to the neural network. In machine learning this process is called feature scaling, and it is one part of data preprocessing.

For the example problem above, we have the following features:

Input features (x-values or independent variables): Age, Income, Sex, Education, Marital Status and Religion

Output feature (y-value or dependent variable): Children

Question: Why is normalization important?

Answer: We have to normalize our data because our features do not have a uniform scale.

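Many machine learning algorithms, such as k-nearest neighbors and k-means, calculate the Euclidean distance between data points. Euclidean distance is the “ordinary” straight-line distance between two points (or vectors in a neural net) in Euclidean space.

Euclidean space is the familiar 2- or 3-dimensional space of geometry, extended to any number of dimensions, with one dimension per feature. A feature with a large scale, such as Income, dominates the distance over a feature with a small scale, such as Age. Hence, we normalize our features to remove this bias from our model. Training on normalized data also converges faster during backpropagation.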

(Figure: Euclidean space)
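To make the scale problem concrete, here is a minimal sketch in plain Python. The three people and their (Age, Income) values are made up for illustration, and the helper names `euclidean` and `min_max_columns` are my own, not from any library.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_columns(rows):
    """Rescale every column of a list of rows into the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans))
            for row in rows]


# Hypothetical (Age, Income) pairs, invented for this illustration.
people = [(25, 50_000), (65, 51_000), (26, 60_000)]
p, q, r = people

# Raw features: Income dominates, so the 40-year age gap between
# p and q barely registers next to the income gap between p and r.
raw_pq = euclidean(p, q)
raw_pr = euclidean(p, r)
print(raw_pq, raw_pr)

# After scaling, both features contribute on a comparable footing.
sp, sq, sr = min_max_columns(people)
scaled_pq = euclidean(sp, sq)
scaled_pr = euclidean(sp, sr)
print(scaled_pq, scaled_pr)
```

On the raw values the second distance is roughly ten times the first, purely because Income is measured in larger numbers. After scaling, the two distances are nearly equal.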

Question: How do you normalize your data?

Answer: To normalize your data, you first need to learn the following methods:

1. Min-max normalization

Take a value, subtract the minimum value from it, and divide the result by the difference between the maximum and minimum values. This rescales the feature to the range [0, 1]; mapping that result linearly onto another interval gives ranges such as [−1, 1].

E.g. for Age:

x = 55

min(x) = 35

max(x) = 77

z = (55 - 35) / (77 - 35) ≈ 0.48
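The calculation above can be sketched in a few lines of plain Python. The function name `min_max_scale` and the `ages` list are my own illustrations; only the values 35, 55 and 77 come from the example.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Rescale a list of numbers linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; min-max scaling is undefined")
    span = hi - lo
    return [new_min + (v - lo) / span * (new_max - new_min) for v in values]


# Hypothetical ages; only 35, 55 and 77 come from the worked example.
ages = [35, 42, 55, 61, 77]

scaled_01 = min_max_scale(ages)              # range [0, 1]
scaled_pm1 = min_max_scale(ages, -1.0, 1.0)  # range [-1, 1]

print(round(scaled_01[2], 2))  # 0.48, i.e. (55 - 35) / (77 - 35)
```

Note that the minimum and maximum should be taken from the training data and reused for any new data, otherwise the same age would be scaled differently at prediction time.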

2. Z-score normalization

Take a value, subtract the mean of all values from it, and divide the result by the standard deviation of all the values. After normalization, the mean will be 0 and the standard deviation will be 1.

E.g. for Age:

x = 55

mean of all values = 51.8

standard deviation of all values = 16.02

z = (55 - 51.8) / 16.02 ≈ 0.20
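Here is a small sketch of the same calculation using Python's standard `statistics` module. The helper names and the `ages` list are hypothetical. I also assume the population standard deviation (`pstdev`); the guide does not say whether its 16.02 is a population or sample figure.

```python
import statistics


def z_score(x, mean, std):
    """Standardize a single value given a known mean and standard deviation."""
    return (x - mean) / std


def standardize(values):
    """Standardize a whole list using its own mean and population std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population std, not sample std
    if std == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [z_score(v, mean, std) for v in values]


# The worked example, using the summary statistics given in the guide.
z = z_score(55, 51.8, 16.02)
print(round(z, 2))  # 0.2

# Hypothetical ages, to show that the result has mean 0 and std 1.
ages = [35, 42, 55, 61, 77]
standardized = standardize(ages)
print(statistics.fmean(standardized), statistics.pstdev(standardized))
```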

3. Constant Normalization

Take your value and divide it by a constant. The rule of thumb is to use a constant that is a multiple of 10.

E.g. for Age:

x = 55

c = 10

z = 55/10 = 5.5
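As a quick sketch, constant normalization is a single division per value. The function name `scale_by_constant` is my own, and the ages are the three values used in the examples in this guide.

```python
def scale_by_constant(values, constant):
    """Divide every value by the same fixed constant."""
    if constant == 0:
        raise ValueError("constant must be non-zero")
    return [v / constant for v in values]


ages = [35, 55, 77]
scaled_ages = scale_by_constant(ages, 10)
print(scaled_ages)  # [3.5, 5.5, 7.7]
```

Unlike the two methods before it, this one does not guarantee a fixed output range or a mean of 0. It only shrinks the values by a known factor.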

4. Binary Encoding

Categorical values like Sex can be encoded as either 0 or 1: Male can be encoded as 1 and Female as 0.

Alternatively, they can be encoded as either -1 or 1: Male as -1 and Female as 1.
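Both encodings amount to a dictionary lookup. In this sketch the mapping names, the `encode` helper and the sample list are hypothetical, chosen only to mirror the two schemes described above.

```python
# The two schemes described above, written as lookup tables.
ZERO_ONE = {"Male": 1, "Female": 0}
PLUS_MINUS = {"Male": -1, "Female": 1}


def encode(values, mapping):
    """Replace each category with its numeric code (KeyError if unknown)."""
    return [mapping[v] for v in values]


# Hypothetical sample column.
sexes = ["Male", "Female", "Female", "Male"]

zero_one = encode(sexes, ZERO_ONE)
plus_minus = encode(sexes, PLUS_MINUS)

print(zero_one)    # [1, 0, 0, 1]
print(plus_minus)  # [-1, 1, 1, -1]
```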

5. One-Hot Encoding

If we have non-binary categorical data, we can use one-hot encoding, which uses 0 or 1 to indicate whether each category is absent or present.

E.g. for Religion:

Muslim as the vector [ 1 0 0 ]

Hindu as the vector [ 0 1 0 ]

Christian as the vector [ 0 0 1 ]
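The encoding above, commonly known as one-hot encoding, can be sketched in plain Python as follows. The function name `one_hot` is my own, and the category order is taken from the example.

```python
def one_hot(value, categories):
    """Return a list with a 1 at the position of `value` and 0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


# Category order taken from the example above.
RELIGIONS = ["Muslim", "Hindu", "Christian"]

encoded = {name: one_hot(name, RELIGIONS) for name in RELIGIONS}

print(encoded["Muslim"])     # [1, 0, 0]
print(encoded["Hindu"])      # [0, 1, 0]
print(encoded["Christian"])  # [0, 0, 1]
```

Each category gets its own column, so no artificial ordering is implied, which would happen if we simply numbered the religions 1, 2 and 3.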

Pro Tip: When coding, use StandardScaler for feature scaling and OneHotEncoder for categorical encoding from the scikit-learn library.
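A minimal sketch of that tip, assuming scikit-learn and NumPy are installed. The ages and religions below are made-up sample data, not the dataset behind the earlier examples.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical sample data; scikit-learn expects 2-D arrays (rows x features).
ages = np.array([[35.0], [42.0], [55.0], [61.0], [77.0]])
religions = np.array([["Muslim"], ["Hindu"], ["Christian"], ["Hindu"], ["Muslim"]])

# Z-score normalization: StandardScaler uses the population standard deviation.
scaler = StandardScaler()
ages_scaled = scaler.fit_transform(ages)

# One-hot encoding: the default output is a sparse matrix, so convert it
# to a dense array for printing.
encoder = OneHotEncoder()
religions_encoded = encoder.fit_transform(religions).toarray()

# Categories are sorted alphabetically, so the columns are
# Christian, Hindu, Muslim rather than the order used earlier in this guide.
print(encoder.categories_)
print(religions_encoded)
print(ages_scaled.ravel())
```

In a real project, call `fit_transform` on the training data only and plain `transform` on validation and test data, so the same mean, standard deviation and category list are reused everywhere.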

Thanks for reading!

Source: Deep Learning on Medium