Original article was published on Deep Learning on Medium
Deep Dive into Competitive Learning of Self-Organizing Maps
Best Explanation Behind Competitive Learning of Self-Organising Maps (SOMs) in Unsupervised Artificial Neural Networks on the Internet!
The self-organizing map (SOM) is one of the most popular unsupervised artificial neural networks. The system has no prior knowledge of the features or characteristics of the input data, or of the class labels of the output data. The network learns to form classes (clusters) of sample input patterns according to the similarities among them, so patterns in a cluster have similar features. Nothing is known in advance about which features matter for classification or how many classes there are. As the name suggests, the network self-organizes: it adjusts itself to the different classes of inputs. The number of nodes in the weighted layer corresponds to the number of different classes. The SOM is based on competitive learning.
What is Competitive Learning?
In competitive learning, the nodes associated with weights compete with each other to win an input pattern (vector). For each distinct input pattern, the node with the highest response is determined and declared the winner. Only the weights associated with the winning node are trained, to make them even more similar to the input pattern (vector). The weights of all the other nodes are not changed. The winner takes all and the losers get nothing, so it is called a Winner Takes All algorithm.
Strength of a Node = Weighted Sum
For Output Node 1:
$Y_1 = X_1 W_{11} + X_2 W_{21} + X_3 W_{31} + \dots + X_D W_{D1}$
Each Node is Associated with a Weight Vector having D Elements
Input Vector: $X = [X_1, X_2, X_3, \dots, X_D]$
Weight Vector of $Y_1$: $[W_{11}, W_{21}, W_{31}, \dots, W_{D1}]$
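The weighted sum above can be sketched in a few lines of Python. This is only an illustration: the input and weight values are hypothetical (they are not from the article), and the names `node_strength`, `weights` and `winner` are made up for this example.

```python
def node_strength(x, w):
    """Weighted sum of one output node: Y = X1*W1 + X2*W2 + ... + XD*WD."""
    return sum(xi * wi for xi, wi in zip(x, w))

# Hypothetical example values: D = 3 input elements, 2 output nodes.
x = [0.6, 0.0, 0.8]
weights = [
    [0.0, 1.0, 0.0],  # weight vector of output node 1
    [0.6, 0.0, 0.8],  # weight vector of output node 2
]

# Strength of every node for this input.
strengths = [node_strength(x, w) for w in weights]

# Index of the node with the highest response (0-based, so 1 means node 2).
winner = max(range(len(weights)), key=lambda i: strengths[i])
print(strengths, winner)
```

Here the second node wins because its weight vector is identical to the input, which gives it the largest weighted sum.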
- Estimate No. of Classes (No. of Output Nodes)
- Set Weights randomly and normalize
- Apply the normalized Input Vector X
- Calculate Strength (i.e. Weighted Sum) of Each Node
- Determine the Node i with the Highest Response
- Declare Node i as the ‘Winner’ (i has the Weights most similar to X)
- Train Weights of Node i to make them even more similar to X
- Weight Vector of Winner is made more Equal to Current Input Vector
- In other words, Current Input Vector is Transferred to Winner
- Winner Carries the Input it Won (The weight vector of the winning node now retains the input pattern which it has been trained for)
- Any Successive Input similar to the Previous one selects this Node as the Winner
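The steps above can be sketched as a small winner-takes-all training loop. Treat it as a minimal sketch under stated assumptions rather than a reference implementation: the learning rate `lr`, the number of `epochs`, the fixed random `seed`, the re-normalization of the winner after each update, and all function names are choices made for this example and do not come from the article.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def dot(a, b):
    """Weighted sum (dot product) of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def train_wta(inputs, n_nodes, lr=0.5, epochs=20, seed=0):
    """Winner-takes-all training; lr, epochs and seed are assumed values."""
    rng = random.Random(seed)
    dim = len(inputs[0])
    # Set weights randomly and normalize them.
    weights = [normalize([rng.random() for _ in range(dim)])
               for _ in range(n_nodes)]
    data = [normalize(x) for x in inputs]
    for _ in range(epochs):
        for x in data:
            # Strength (weighted sum) of each node for this input.
            strengths = [dot(x, w) for w in weights]
            i = strengths.index(max(strengths))  # the winner
            # Move only the winner's weights toward the input.
            moved = [wi + lr * (xi - wi) for wi, xi in zip(weights[i], x)]
            # Assumption: keep the winner at unit length after the update.
            weights[i] = normalize(moved)
    return weights

def winner_of(x, weights):
    """Index of the node that responds most strongly to x."""
    x = normalize(x)
    strengths = [dot(x, w) for w in weights]
    return strengths.index(max(strengths))

# Hypothetical data: two loose clusters of 2-element patterns.
inputs = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
weights = train_wta(inputs, n_nodes=2)
print([winner_of(x, weights) for x in inputs])
```

With this data and seed, patterns from the same cluster select the same node and the two clusters select different nodes. With other data or other starting weights, one node could win every input, so the number of nodes and the initial weights matter.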
Many explanations of neural networks skip the theory behind this training process. It is usually said that during training the Weight Vector of the Winner is made more Equal to the Current Input Vector, but the reason why this works is rarely explained. So here I would like to explain the theory using the basic mathematics of neural competition.
Scalars and Vectors
- A Scalar has only Magnitude e.g. length, area, volume, speed, mass, density, pressure, temperature
- A Vector has both Magnitude and Direction e.g. displacement, velocity, acceleration, momentum, force, weight
Normalization of Input and Weight Vectors
- For convenient training, both Input and Weight Vectors are normalized to a unit length
- Normalization Process is explained below
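$X = [X_1, X_2, X_3, \dots, X_D]$
Norm of X: $|X| = \sqrt{X_1^2 + X_2^2 + X_3^2 + \dots + X_D^2}$
- Norm of a Vector is said to be the ‘Strength’ of the Vector i.e. its Magnitude
- Norm of a Normalized Vector is 1 (unit vector)
- i.e. If X is a Vector, the Normalized X is $X / |X|$
e.g. $X = [0.2, 0.1, 1.4, 0.2]$, so $|X| = \sqrt{0.04 + 0.01 + 1.96 + 0.04} = \sqrt{2.05} \approx 1.4318$
Normalized $X = [0.1397, 0.0698, 0.9778, 0.1397]$
Norm of Normalized
$X = \sqrt{0.1397^2 + 0.0698^2 + 0.9778^2 + 0.1397^2} \approx 1$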
When the inputs are non-negative, a Normalised Vector has elements between 0 and 1 (in general, between -1 and 1). When the Input Features are on different scales e.g. [1.2 0.001 10.6], normalization brings the vectors to a common unit length. When the Weight Vectors are also normalized, the Training process becomes simple. When all input patterns are normalized to unit length, they can be represented as different radii of a unit sphere (different orientations).
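The normalization can be checked numerically with a short Python sketch. It uses the article's example vector [0.2, 0.1, 1.4, 0.2]; the helper names `norm` and `normalize` are chosen for this illustration.

```python
import math

def norm(v):
    """Length (magnitude) of a vector: square root of the sum of squares."""
    return math.sqrt(sum(c * c for c in v))

def normalize(v):
    """Divide every element by the norm so the result has unit length."""
    n = norm(v)
    return [c / n for c in v]

x = [0.2, 0.1, 1.4, 0.2]
x_hat = normalize(x)

print([round(c, 4) for c in x_hat])  # [0.1397, 0.0698, 0.9778, 0.1397]
print(round(norm(x_hat), 4))         # 1.0
```

The rounded values agree with the normalized vector quoted in the article, and the norm of the result is 1 up to floating-point rounding.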
The diagram below shows the normalized weight vectors in a unit sphere, with the input vector drawn in the same sphere.
Here the Length of each Vector = 1.
Before and After Normalisation
For the net to encode the training set, the weight vectors must become aligned with any clusters present in this set, with each cluster represented by at least one node. Then, when a vector is presented to the net, there will be a node, or a group of nodes, which responds maximally to the input.
The similarity of Two Vectors
- If X1 = [x1, x2 , x3, x4] and Y1 = [y1, y2 , y3, y4] then X1 = Y1
if and only if
x1 = y1
x2 = y2
x3 = y3
x4 = y4
X1 and Y1 are said to be ‘identical’.
Dot Product: $X \cdot Y = |X||Y| \cos\theta$
$|X|$: Vector Length
$\theta$: Angle between the two Vectors
If $|X| = 1$ and $|Y| = 1$
$X \cdot Y = \cos\theta$ and $-1 \le \cos\theta \le 1$ (for vectors with non-negative elements, $0 \le \cos\theta \le 1$)
If $\theta \to 0$ (then $\cos\theta \to 1$)
Two Unit Vectors Coincide
i.e. Both Vectors (X and Y) are Equal
i.e. X Coincides with Y
$X \cdot Y = |X||Y| \cos\theta = 1 \cdot 1 \cdot \cos\theta$
When $\theta \to 0$, Vector X = Vector Y
So we change the angle $\theta$ between the two vectors in order to make two normalized vectors equal.
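To make the link between the dot product and the angle concrete, here is a small Python sketch that recovers the angle between two vectors from the dot product of their normalized forms. The function names and the test vectors are hypothetical, chosen only for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def angle_deg(a, b):
    """Angle between a and b in degrees, via the dot product of unit vectors."""
    cos_theta = dot(normalize(a), normalize(b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

same = angle_deg([1.0, 2.0], [2.0, 4.0])      # same direction: about 0 degrees
diagonal = angle_deg([1.0, 0.0], [1.0, 1.0])  # about 45 degrees
right = angle_deg([1.0, 0.0], [0.0, 1.0])     # perpendicular: about 90 degrees
print(same, diagonal, right)
```

Two vectors pointing the same way give a dot product of about 1 after normalization, so the angle is about 0, which is the sense in which the two unit vectors coincide.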
Training of SOMs
So during training, for a given input pattern we find the winning node, i.e. the node with the highest response value. Then the Weight Vector of the Winner is made more Equal to the Current Input Vector. According to the mathematical explanation above, what we do during training is reduce the angle between the normalized input vector and the normalized weight vector of the winning node until the two vectors coincide with each other. In other words, until the two vectors become equal.
Training makes the Weights of a Particular Node similar to the Applied Input. In other words, the input vector is ‘Transferred’ to the Winning Node in the form of its Weights. When a Similar Input Vector is applied, the Weighted Sum of the same Winner will be the Highest.
Similarly, this process is continued for all the input patterns until the input vector coincides with the weight vector of the winning node for each input pattern.
Training Equation — Kohonen Learning Rule
We can verify this mathematical explanation using the Kohonen learning rule, in which the weights are adjusted only for the winning output node, whose output is 1. For every other node the output is 0, so the weight adjustment is zero. For the winner, the weights are adjusted to bring the weight vector W closer to the input vector X, which is done by reducing the angle between the two vectors. When the two vectors coincide, the node is trained for that pattern and no further weight adjustment is needed. This process is continued for all the input patterns until the artificial neural network is fully trained.
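The article's own training equation is not reproduced in the text, so the sketch below assumes the commonly used form of the Kohonen update for the winner, W_new = W + lr * (X - W), followed by re-normalization. The learning rate of 0.3, the starting weight vector and the function names are assumptions made for this illustration. It tracks the angle between W and X over repeated updates.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def angle_deg(a, b):
    """Angle in degrees between two unit vectors."""
    return math.degrees(math.acos(max(-1.0, min(1.0, dot(a, b)))))

def kohonen_step(w, x, lr):
    """Assumed update for the winner: w + lr * (x - w), then re-normalize."""
    moved = [wi + lr * (xi - wi) for wi, xi in zip(w, x)]
    return normalize(moved)

x = normalize([0.2, 0.1, 1.4, 0.2])  # the article's example input
w = normalize([1.0, 1.0, 1.0, 1.0])  # hypothetical starting weights

angles = []
for _ in range(10):
    angles.append(angle_deg(w, x))   # angle before this update
    w = kohonen_step(w, x, lr=0.3)

print([round(a, 2) for a in angles])
print(round(dot(w, x), 4))
```

Each update shrinks the angle between the weight vector and the input, and the dot product of the two unit vectors approaches 1, which matches the description of the weight vector rotating until it coincides with the input.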
I hope this article helps you understand the actual theory behind the competitive learning of Self-Organising Maps (SOMs) in Unsupervised Artificial Neural Networks. It was written to share the knowledge of my experienced lecturer with the rest of the world. All the credit goes to my University senior lecturer, Dr. H.L. Premaratne, who specializes in:
- Neural Networks and Pattern Recognition
- Image Processing and Computer Vision
He asked me to write this because few articles on the internet explain this theory. I hope you all gain valuable knowledge from an expert in this area.